Data Sets

Datasets were searched for using the same filters as before - must be human tissue that expressed lewy bodies, must have at least 3 patients and 3 controls, and must…

10 datasets were identified:

CodeID Tissue/Cell type Variant Samples Type of array GEO Ref Comments PIs
DIJ Substantia Nigra not specified 15 disease, 8 control Affymetrix Human Genome U133 Plus 2.0 Array GSE49036 Has some Braak staging Dijkstra AA, Ingrassia AI, de Menezes RX, van Kesteren RE, Rozemuller AM, Heutink P, vandeBerg WJ
FFR Substantia Nigra not specified 9 replicates for the controls and 16 replicates for the Parkinson’s disease patients Affymetrix Human Genome U133 Plus 2.0 Array GSE7621 Ffrench-Mullen JM
MOR.SN substantia nigra, split into medial and lateral portions sporadic 15 samples of medial parkinsonian SN, 9 samples of lateral parkinsonian SN (24), 8 medial nigra control samples and 7 lateral nigra control samples(15) Affymetrix Human Genome U133A Array GSE8397 Age and sex Moran LB, Graeber MB
MID1 Substantia Nigra pars compacta not specified 10 PD brain samples and 8 control Affymetrix Human Genome U133 Plus 2.0 Array GSE20141 Middleton FA, Kim PD, Zhang-James Y, Davis RL
MID2 whole substantia nigra not specified 18 controls, 11 patients Affymetrix Human Genome U133A Array GSE20292 Age and sex Middleton FA, James M, Zhang Y, Davis RL
DUM Cortex (BA9) not specified 29 PD, 44 neurologically normal controls Illumina HiSeq 2000 GSE68719 Have age of death, sex, and some samples have corresponding proteomics Dumitriu A, Golji J, Labadorf AT, Gao B, Beach TG, Myers RH, Longo KA, Latourelle JC
MOR.FC Frontal cortex sporadic 3 Controls 5 Patients Affymetrix Human Genome U133A Array GSE8397 Age and sex Moran LB, Graeber MB
LEW two regions of the medulla not specified 14 PD samples and 8 controls (two regions) Affymetrix Human Genome U133A 2.0 Array GSE19587 Age and sex lewandowski nm, small sa
MID3 prefrontal cortex not specified 15 controls 14 patients Affymetrix Human Genome U133A Array GSE20168 Age and sex Middleton FA, James M, Zhang Y, Davis RL
MID4 putamen not specified 20 controls and 15 patients Affymetrix Human Genome U133A Array GSE20291 Age and sex Middleton FA, James M, Zhang Y, Davis RL

Quality Control - PD_QC.R

Checked data distribution using PCA and boxplots. Some of the datasets have quite mixed samples wereas others separate quite nicely into controls and patients. Boxplots didn’t suggest any samples were outliers.

Differential Gene Expression

As with the TDP-43 data, microarray data was analysed using Limma. The genes were divided into positive and negative fold change. Intersected gene lists can be found in /Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/GeneExpression/FoldChange.

Commonly Upregulated Genes

~~~ ~~~ ~~~ ~~~ ~~~ ~~~ ~~~ ~~~ ~~~ ~~~ ~~~
A4GNT C3 CPED1 FAM46A HCK KDM4A LY6G5C NRDE2 RBMS2 SMIM14 TNFRSF10B
ABCC6 C3AR1 CPM FAM60A HDC KDM4C LYVE1 NYX RCC1 SNAP23 TNFRSF10D
AC012065.7 C8orf60 CPSF3L FANCE HERC2P3 KIAA0101 LYZL6 OR1F1 RDH16 SND1-IT1 TNFRSF25
ACP5 C9orf3 CROCC FAS HIST1H1T KIAA0226L MAF ORM1 RHBDF2 SOGA1 TNFSF14
ACSBG2 CALML4 CROCCP3 FASLG HIST1H2AB KIAA0485 MAFF OSM RIBC2 SORBS1 TNN
ACSS3 CASP1 CSPG4 FCGBP HIST1H2AD KIAA1614 MAP2K7 P4HA1 RIPK2 SOS2 TP73-AS1
ADH1B CASP2 CTBP2 FCGR2A HMOX1 KIAA1661 MAPKBP1 PAGE1 RNASE6 SOX12 TRAF1
ADIPOQ CASP4 CTD-2269F5.1 FCGR2B HNRNPM KIF13A MAVS PAPOLA RP11-217B7.2 SOX2 TRAPPC10
ADORA3 CASP6 CXCR4 FCRL2 HOXB8 KLHL36 MBD5 PARP16 RP11-255C15.3 SOX9 TRIM21
AF007147 CASP7 CXorf21 FGF17 HP1BP3 KLK2 MBTD1 PAWR RP11-403P17.4 SP140L TRIM5
AGAP10 CASP8 CXorf36 FGF3 HS3ST1 KRI1 MCM3AP-AS1 PCDH12 RP11-548H18.2 SPG21 TTLL5
AGRP CATSPER2 CYP21A1P FGFR1 HSPA1A KRT19P2 MECOM PCDHGA8 RPA4 SPINK1 TUBD1
ALOX5 CCDC101 CYP39A1 FKBP10 HSPA1L L2HGDH MED13L PDLIM1 RPL23AP32 SPTLC3 TULP2
ALOX5AP CCDC15 CYTH4 FLCN HSPA6 LAG3 METTL4 PELI2 RPL23AP53 SRP19 TYMP
ALPK1 CCDC40 DAPP1 FLJ21369 HSPD1 LAMB2 METTL7A PGF RREB1 SRSF5 TYROBP
AMELX CCL17 DBF4B FLT1 IFNA2 LAT2 MFAP5 PGPEP1 RRNAD1 SSTR3 UGGT1
AMELY CCL22 DCLRE1C FLT3LG IFNA21 LECT1 MICB PIDD1 RUNX1-IT1 ST20 UPK3B
AMH CCL27 DDR1-AS1 FMO5 IGF1R LEF1 MID1 PIGO RUNX3 ST6GAL1 USP34
ANGPT2 CD14 DENND2D FNDC3B IGFBP5 LENEP MILR1 PILRA S100A11P1 STAG3L3 VPS54
ANP32A-IT1 CD163 DFFB FOXD1 IGKV1-17 LEPREL1 MIR612 PIM1 S100A4 STIP1 VSIG4
AP1G2 CD207 DIP2A FOXL2 IGLL3P LEPREL4 MKRN4P PLAC8 SAFB2 STK10 WBSCR16
APOBEC3C CD22 DISC1 FPR1 IGSF9B LGALS9 MS4A6A PLEKHF2 SAT1 STOM WDR4
APOC2 CD247 DKFZP434A062 FYB IL12RB1 LILRA2 MSH5 PLGLB1 SCAF4 TAF4 WDR52
AQP8 CD300A DMC1 FZD5 IL16 LINC00115 MSX1 PLIN2 SCD5 TAL1 WDR55
ARHGAP25 CD37 DNA2 FZD7 IL17RB LINC00260 MT1M PLK3 SCIN TAS2R10 WDR78
ARHGDIB CD48 DNAJB1 FZD9 IL18 LINC00894 MYH7B PLOD2 SCNN1B TAS2R13 XRCC2
ARHGEF4 CD68 DNAJB6 G0S2 IL1R1 LINC00963 MYO1C PLTP SDCCAG3 TAZ ZBED2
ARID3A CD84 DNASE2 G3BP1 IL21 LLGL2 MZT2B PODNL1 SERPINA1 TBX6 ZC3HAV1
ATAD2B CDA DOCK10 GABPA INE1 LOC100272216 NABP1 POGLUT1 SERPINH1 TBXAS1 ZFP36L1
ATHL1 CDC14A DOCK2 GAL3ST4 INPP5B LOC100506699 NACAP1 POGZ SERTAD3 TCF3 ZMYND8
ATP8B1 CDK2AP2 DSE GAS1 INPP5D LOC101928198 NBEAL2 PPP1R14D SH2B2 TFAP2B ZNF14
ATXN7 CEBPD DTYMK GATA1 INVS LOC101928274 NBR2 PPP2R1B SHMT1 TFEC ZNF217
AVIL CEL EEF1DP5 GCKR IRF4 LOC101929240 NCKAP1L PRR11 SIGLEC7 TFPI ZNF224
AZGP1 CHD2 EFNA4 GDF15 IRF6 LOC101929889 NEUROG1 PRRG4 SIGLEC8 TGFBR3 ZNF235
AZGP1P1 CHKB-CPT1B EGF GDF9 IRF7 LOC102724229 NFATC3 PSCA SIM2 TGIF1 ZNF34
AZU1 CHRNG ELF4 GEM ITFG2 LOC102725016 NFE2 PSG11 SIX6 TIMM44 ZNF354A
BANP CHST4 EMC10 GJA9-MYCBP ITGB6 LOC284009 NHLH1 PTPN14 SLA TLR5 ZNF430
BC069804 CLEC2B ENTPD1 GJD2 ITPR3 LOC729164 NOTCH2NL PTPRCAP SLAMF8 TM4SF1 ZNF446
BMP8B CLIC2 ENTPD7 GK2 JMJD6 LOC79999 NOX3 RAB13 SLC11A1 TMEM176A ZNF516
BRPF1 CLK1 EP300 GP9 KCNE4 LPAR2 NPIPA1 RAC2 SLC27A3 TMEM19 ZNF665
BTBD18 COL16A1 EPOR GPR25 KCNJ8 LPAR4 NPIPB15 RAD52 SLC2A5 TMEM51 ZNF670
C10orf12 COL5A1 ETV1 GRAMD3 KCNK5 LPP NPL RAD54L SLC35C2 TMPRSS11E ZNF710
C11orf16 COL7A1 FAM106A GTSE1 KCNQ1 LRRC1 NPR3 RBL1 SLC7A9 TMPRSS5 ZNF721
C1QB COL9A1 FAM118A GUSBP3 KCTD5 LRRFIP1 NR6A1 RBM38 SMAD6 TNFAIP8 ZNF74
ZNF783

Commonly Downregulated Genes

~~~ ~~~ ~~~ ~~~ ~~~ ~~~ ~~~ ~~~ ~~~ ~~~ ~~~
ABCA11P ATP5D CLINT1 EIF2B3 GPN1 LOC728392 MZT2A PFDN2 PSMD8 SLC9A1 TSSC1
ABCF1 ATP5F1 CLIP3 EIF2S1 GPX4 LRBA NAPA PFKM PSME3 SLITRK5 TTC19
ABHD11 ATP5G1 CLTB EIF3F GRAMD1B LRPPRC NDEL1 PFN2 PSMG1 SNCA TTC9
ACAT1 ATP6AP1 CMAS EIF3I GSS LYRM4 NDFIP1 PGAM1 PTBP2 SNRPN TUBA1B
ACAT2 ATP6V0B COA3 EIF3K GSTA4 LYRM9 NDUFA10 PGRMC1 PTDSS1 SNX24 TUBA1C
ACP2 ATP6V0C COA7 EIF6 GSTO1 MAGED1 NDUFA13 PHB2 PTP4A1 SORD TUBB
ACTB ATP6V1E1 COMMD9 ELOVL6 HARS MAGED2 NDUFA3 PINK1 PTPRA SPAG16 TUBB2A
ACTR1A ATP6V1F COPS3 ELP3 HERC6 MAK16 NDUFA7 PITPNA RAB14 SPAG7 TUBB3
ADAM23 ATP6V1G2 COPS4 EMC7 HINT1 MAP1B NDUFA9 PITPNB RABEPK SPCS1 TUBG1
AFG3L2 ATP6V1H COPS5 ENDOD1 HLTF MAP1LC3B NDUFAB1 PITRM1 RAN SPCS3 TUFM
AGTPBP1 AURKAIP1 COPS7A ENOPH1 HMCES MAPK10 NDUFAF1 PKI55 RANGRF SPRYD7 TUSC2
AHCY AVPI1 COQ3 ENSA HMGN4 MDH1 NDUFB5 PLD3 RARS SRPRB TXNDC15
AIFM1 B3GNT2 COX7A2L ENTPD6 HNRNPA0 MDH2 NDUFS1 PLEKHB2 RBFOX2 SRRD UBB
AIMP2 BABAM1 COX8A ERCC1 HPCAL4 ME2 NDUFS3 PMPCA RBM3 SSSCA1 UBE2E3
AJAP1 BECN1 CREB3 ERP29 HSPA12A ME3 NDUFS5 PMS2 RITA1 ST3GAL5 UBE3B
AK055981 BEND5 CRMP1 EXOSC9 IARS MECR NDUFV1 PMS2P1 RNASEH1 STAT4 UCHL1
AK5 BEX4 CRY2 FABP3 IARS2 MFN2 NDUFV2 PMS2P5 RNF10 STRAP UQCRC1
AKAP12 BFSP1 CSNK2A1 FAF1 IDH3B MGST3 NEDD8 POLR2C RPA2 STS UQCRFS1
AKAP6 BPGM CSRNP2 FAM127A IDH3G MICU1 NEU1 POLR3C RPH3A STX12 UQCRQ
AKTIP BRE CTNNA2 FAM134C IDI1 MIEF1 NGFRAP1 POP4 RPP14 STXBP1 UROD
ALAS1 BRF2 CUL2 FAM162A IMP4 MIF NHP2 PPFIA2 RPS6KC1 SUMO3 UROS
ALDH1A1 BSN CX3CL1 FAM168A INSIG1 MIR21 NIF3L1 PPIA RRAGA SYN1 USP7
ANXA2 BTBD3 CXorf40A FAM206A IQSEC1 MIR3656 NIPSNAP1 PPME1 RTCB SYT11 UTP18
ANXA6 C11orf49 CXorf40A FBXW2 IRGQ MIR4784 NIT2 PPP1R7 RTF1 TACO1 VAMP7
AP1S1 C12orf10 CXorf40B FDPS ITFG1 MIR636 NME1 PPP2CA RTN4 TCEA2 VLDLR
AP2M1 C1orf216 CYC1 FECH ITPA MIR6890 NOP16 PPP2R1A RWDD2B TCEAL2 VPS41
AP2S1 C21orf33 CYP2E1 FHIT ITPR1 MLH1 NRXN3 PPP2R2B SAMM50 TCEAL4 VPS4B
AP3M2 C3orf62 DCTN3 FIBP KATNB1 MMP24-AS1 NSDHL PPP2R5D SARS TFCP2 VPS51
AP3S2 C5orf30 DCTN6 FIG4 KHDRBS1 MOCS2 NUDCD3 PPP3CB SCAMP5 TFPT WBP11
APEH C6orf106 DCTPP1 FKBP1B KIAA0391 MOSPD1 NUDT2 PRC1 SCFD1 TIMM10 WDR44
APEX1 CCBL2 DDA1 FLRT1 KIAA0513 MPI NUP155 PRKCZ SDHA TIMM10B WDR61
APLP2 CCDC51 DDHD2 FRMPD4 KIAA1279 MRPL15 ODC1 PRMT8 SDHAF1 TIMM8B WDR7
APMAP CCK DDX1 FTO KIF2A MRPL35 ODF2 PRPF4 SDHAP1 TMED3 WDR77
AREL1 CCNH DDX24 FUCA1 KIF3C MRPL4 OPTN PRR13 SEC31A TMEM177 WDYHV1
ARF1 CCSER2 DDX42 GABARAPL1 KIFAP3 MRPS18A ORC5 PSD3 SEH1L TMEM208 YWHAZ
ARF3 CDC123 DGUOK GABBR1 KLHDC4 MRPS22 OSBP PSMA1 SERINC3 TMEM246 ZDHHC4
ARFGAP2 CDC42 DHRS11 GALNT11 L1CAM MRPS33 OXCT1 PSMA5 SEZ6L2 TMEM41B ZDHHC6
ARHGEF9 CDK20 DHRS7B GARS LANCL1 MRPS35 OXLD1 PSMB3 SF3A3 TMEM97 ZMYM4
ARL2BP CDK5 DIABLO GBAS LANCL2 MRPS7 PAAF1 PSMB4 SF3B5 TMUB2 ZNF365
ARMC2-AS1 CDK7 DKK3 GGNBP2 LBH MRTO4 PAFAH1B1 PSMB5 SH3BP5 TOMM20 ZNF593
ARPC1A CDS2 DLD GLO1 LCMT1 MTCH1 PARL PSMB6 SIPA1L1 TOR1AIP2 ZNF629
ARPC5L CERK DNAJC8 GLOD4 LDHA MTCH2 PARN PSMB7 SLC23A2 TOX4
ASS1 CFL1 DRG1 GLT8D1 LDHB MTCL1 PCCB PSMC1 SLC25A11 TP53BP1
ASTN2 CHCHD2 DSTN GMPR2 LDLRAD4 MTHFD1 PCYOX1L PSMC2 SLC25A3 TPGS2
ATG14 CHP1 DUSP26 GNB1 LDOC1 MTPAP PDCL3 PSMC5 SLC25A46 TRAP1
ATP1A1 CHST1 DYNC1H1 GNB5 LETMD1 MTX2 PDHB PSMD1 SLC25A5 TRAPPC2L
ATP1A3 CIAPIN1 DYNC1I1 GOT1 LINC00094 MX1 PDK2 PSMD11 SLC25A6 TSG101
ATP2B2 CIRBP DZIP3 GOT2 LMO3 MYH10 PDZD8 PSMD14 SLC30A9 TSPAN3
ATP5A1 CITED1 EAPP GPD1L LOC101927673 MYL12B PEX14 PSMD2 SLC35B1 TSPAN7
ATP5B CLCN6 EIF2AK1 GPI LOC645166 MYL5 PEX19 PSMD7 SLC6A1 TSR2

Mouse data

Mouse data was sourced from GEO https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE4758, referenced as CON, however during the analysis I realised that alpha synuclein had a positive log fold change, even though every other dataset reported a negative log fold change for sporadic parkinson’s disease. Looking back at the dataset I realised it was an overexpression model rather than a mutant. Woops!

Alpha synuclein expression

What’s interesting is that canonically, alpha synuclein has been suggested to be overexpressed in sporadic parkinson’s patients.

Filtering out other signals

As was done with the TDP-43 pipeline, I had to find other signals that could be contaminating the sporadic signal. I decided to use the ALS datasets from the TDP-43 analysis to represent “general neurodegeneration”, (C9orf72, sALS, PET, RAV, SOD1, FUS) but actually there was very little overlap:

Common Up (ALS)

ALOX5 ALPK1 STOM MAFF COL7A1

Common Down (ALS)

PMS2P1 ATP6V1G2 INSIG1 RRAGA TPGS2 MTCH2 PPP2CA GLO1 PDHB

In a similar vein I decided to source some sporadic Parkinson’s disease blood - to make sure it was leaving a more tissue specific signal. I found two datasets on GEO - GSE99039 (AMA) and GSE72267 (RON). These were bigger datasets than I usually use - AMA was total 353 samples, RON 59. What I realised was that my filters for the presence calls were still 2. This meant that 24,000 genes/probes passed through filtering which was making huge overlaps. I realised a filter of 2 wasn’t really proportional to the number of samples in AMA, so I increased this to 20 (as the dataset is approximately 10 times larger than the other datasets I have been using).

These two datasets produced a common gene expression signature of approximately 5000 genes, which is quite a lot of concordance. When you compare this to the brain signal, there is an overlap of 88 upregulated and 76 downregulated genes. This is quite a small overlap considering the size of the blood signature.

At this point, the remaining gene list for the sporadic signature is 558 genes (look for Filt_ALS_blood files)

Enrichr

Down http://amp.pharm.mssm.edu/Enrichr/enrich?dataset=3rwmj

Up http://amp.pharm.mssm.edu/Enrichr/enrich?dataset=3rwmr

CON DATA

There was a dataset I labelled “CON” with the GSEnumber GSE4758 that I included for a little while as it was a mouse model of alpha synuclein which I later removed as the research question changed.

Adding LRRK2 Data

Oliver Bandmann mentioned that there is a phenomenon that LRRK2 patients exhibit their disease more closely to sporadic patients than other familial patients. I thought this was interesting and could be a good application of my methdology.

I sourced two datasets with LRRK2 patient data - GSE34516 and GSE23290. These were exon array so had to first be preprocessed by Wenbin in the Affymetrix Console using RMA-sketch. The output was normalised expression data. Mas5Calls didn’t seem to work so I skipped this step.

#### BOT GSE34516 #########
setwd("/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Data/")
Data <- read.csv("GSE34516.RMA-GENE-EXTENDED-EC-hg19-na36_LRRK2only.csv", row.names = 1)

genename <- as.data.frame(Data$GeneSymbol)
rownames(genename) <- rownames(Data)
colnames(genename) <- "Gene.Symbol"
#Data is already normalised using RMA-sketch
analysis.name<-"BOT" #Label analysis
expressionMatrix<-Data[,1:6] #takes expression from normalised expression set


Treat<-factor(rep(c("Control", "Patient"),c(4,2)), levels=c("Control", "Patient"))
design<-model.matrix(~Treat)
rownames(design)<-colnames(expressionMatrix)
design

#Conduct statistical analysis of expression
library(limma)
fit<-lmFit(expressionMatrix, design) #linear model fit
fit<-eBayes(fit) 
result<-topTable(fit, coef="TreatPatient", adjust="BH", number=nrow(expressionMatrix)) #"BH" adjust for multiple hypothesis testing
#toptable normally takes top number but this takes all
setwd("/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/GeneExpression")
write.csv(result, file = paste(analysis.name, "_result.csv", sep = ""))


result$"Fold Change"<-2^result$logFC 
result$"Fold Change"[result$"Fold Change"<1]<-(-1)/result$"Fold Change"[result$"Fold Change"<1] #converts log fold change into a linear value above or below 0
expressionLinear<-as.data.frame(2^expressionMatrix)
expressionLinear$ProbeSetID<-rownames(expressionLinear)
result<-merge(result, expressionLinear, by = 0) #merge values into one array
result<-merge(result, genename, by.x = "Row.names", by.y = 0)
result<-subset(result, subset=(Gene.Symbol !="---")) #if no gene symbol, discount


setwd(dir = "/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/GeneExpression/")
write.csv(result, file=paste(analysis.name, "_finalresult.csv", sep=""), row.names=FALSE, quote = FALSE)

genesort <- result[order(result$P.Value),]
uniqueresult <- genesort[!duplicated(genesort[,16]),]
write.csv(uniqueresult, file=paste(analysis.name, "rankeduniqueresult.csv", sep=""), row.names=FALSE, quote = FALSE)

#### BOT2 GSE23290 #########
setwd("/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Data/")
Data <- read.csv("GSE23290.RMA-GENE-EXTENDED-EC-hg19-na36_LRRK2only.csv", row.names = 1)

genename <- as.data.frame(Data$GeneSymbol)
rownames(genename) <- rownames(Data)
colnames(genename) <- "Gene.Symbol"
#Data is already normalised using RMA-sketch
analysis.name<-"BOT2" #Label analysis
expressionMatrix<-Data[,1:8] #takes expression from normalised expression set


Treat<-factor(rep(c("Control", "Patient"),c(5,3)), levels=c("Control", "Patient"))
design<-model.matrix(~Treat)
rownames(design)<-colnames(expressionMatrix)
design

#Conduct statistical analysis of expression
library(limma)
fit<-lmFit(expressionMatrix, design) #linear model fit
fit<-eBayes(fit) 
result<-topTable(fit, coef="TreatPatient", adjust="BH", number=nrow(expressionMatrix)) #"BH" adjust for multiple hypothesis testing
#toptable normally takes top number but this takes all
setwd("/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/GeneExpression")
write.csv(result, file = paste(analysis.name, "_result.csv", sep = ""))


result$"Fold Change"<-2^result$logFC 
result$"Fold Change"[result$"Fold Change"<1]<-(-1)/result$"Fold Change"[result$"Fold Change"<1] #converts log fold change into a linear value above or below 0
expressionLinear<-as.data.frame(2^expressionMatrix)
expressionLinear$ProbeSetID<-rownames(expressionLinear)
result<-merge(result, expressionLinear, by = 0) #merge values into one array
result<-merge(result, genename, by.x = "Row.names", by.y = 0)
result<-subset(result, subset=(Gene.Symbol !="---")) #if no gene symbol, discount


setwd(dir = "/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/GeneExpression/")
write.csv(result, file=paste(analysis.name, "_finalresult.csv", sep=""), row.names=FALSE, quote = FALSE)

genesort <- result[order(result$P.Value),]
uniqueresult <- genesort[!duplicated(genesort[,18]),]
write.csv(uniqueresult, file=paste(analysis.name, "rankeduniqueresult.csv", sep=""), row.names=FALSE, quote = FALSE)

I had to skip out a number of steps because it didn’t require the annotation file. There were about 28000 rows left over which was quite a lot, though approximately 7000 were LOC transcripts so took us down to the number of genes you would expect.

After adding these two datasets to the FoldChange_PD.R script, and filtering out for ALS and sporadic PD blood, there were 85 upregulated and 109 downregulated genes. This still included MSX1, SNCA, UCHL1, and ALDH1A1 from PD Malacards but not PINK1 anymore.

EnrichR UP: http://amp.pharm.mssm.edu/Enrichr/enrich?dataset=3t9ff DOWN: http://amp.pharm.mssm.edu/Enrichr/enrich?dataset=3t9fh

Change of direction

The other datasets I had were fibroblasts from PARK2 and LRRK2 patients as provided by Robin Highly/Oliver Bandmann/Heather Mortiboys. These were exon array datasets which were analysed by Wenbin using the Affymetrix console. When these were added to the filters, the number of common genes was 36, only one of which MSX1 was retained.

To give a little more signal, but to also see if the signal could be recovered, I discluded the blood data (as fibroblast is representative enough of a non-neuronal tissue), leaving 54 genes

DOCK2 RAD52 TNFAIP8 P4HA1 MS4A6A CD300A KLHL36 TRIM5 CD163 CXCR4 RIPK2 PAWR ZNF516 JMJD6 CASP7 MSX1 IGF1R LRRC1 SAFB2 C1QB IL17RB LAMB2 CCDC40 CDA FMO5 DNASE2 NFE2 ZNF217 HMOX1 PARP16 HP1BP3 PGPEP1 CD14 NPIPB15 IMP4 LETMD1 AP1S1 FHIT MECR ATP2B2 DCTPP1 WDYHV1 PSMC2 NRXN3 MRTO4 NDUFS3 SLC6A1 NME1 MRPL15 COPS3 TUBG1 PRC1 BFSP1 IARS

Approximately 10 of these genes have a relatively strong relationship with Parkinson’s or LRRK2 specifically. 32 were directly related according to VarElect, 22 indirectly related (VarElect_Results_cmgreen1-sheffield-ac-uk-20180522-014621619).

The next step was to find the PPI partners of these 54 genes.

PPI

#### ALS PARK2 LRRK2 ####

library(biomaRt)

setwd("/Users/clairegreen/Documents/PhD/TDP-43/TDP-43_Code/Results/PPI_Network/")

PPI <- read.table("iref14_Human_UP_noDup_table_nodash.txt", header = T)
braingenes <- read.csv("Zhang_BrainCelltype_Markers_braingenes.csv", header = T)

DEG_list <- readLines("~/Documents/PhD/Parkinsons/Parkinsons_Code/Results/ALSPARK2LRRK2/removeallALLgenes.txt")

mart <- useMart("ENSEMBL_MART_ENSEMBL",dataset="hsapiens_gene_ensembl", host="www.ensembl.org")
attributes <- listAttributes(mart)
mart_back <- getBM(attributes =c("hgnc_symbol", "uniprotswissprot"), filters="hgnc_symbol", values=DEG_list,  mart=mart)
genelist_Uniprot <- subset(mart_back, !(mart_back$uniprotswissprot == ""))
setwd("/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/ALSPARK2LRRK2/")
write.csv(genelist_Uniprot, "martback.csv", row.names = F)

# IDENTIFY MISSING GENES AND FIND UNIPROT CODES FOR THEM. NB SOME GENES MAY NOT BE PROTEIN CODING #

mart_table <- read.csv("martback.csv", header = T) #A table with the uniprot codes for the DEGs
uniprot_gene <- mart_table$uniprotswissprot

DEG_PPI <- subset(PPI, PPI$V1 %in% uniprot_gene | PPI$V2 %in% uniprot_gene)
rownames(DEG_PPI) <- 1:nrow(DEG_PPI)

write.csv(DEG_PPI, "DEG_PPI_ALP.csv", row.names = F)

DEG_PPI <- read.csv("DEG_PPI_ALP.csv")

DEG_PPI <- subset(DEG_PPI, DEG_PPI$Gene1 !="-")
DEG_PPI <- subset(DEG_PPI, DEG_PPI$Gene2 !="-")

write.csv(DEG_PPI, "FinalPDPPI.csv", row.names = F, quote = F)

From this we have a network of 1367 interactions between 1034 proteins (file = PD_DEGPPI.R, DEG_PPI_ALP.csv, ALP_PPIgenes.txt. Gene names used).

Next I had to find the corresonding rows in each of the RNA expression data files ready for the correlation anaylysis. The LRRK2 samples were not included in the coexpression analysis because with only 2 and 3 patients respectively, it would results in rho values of 1/-1 or 1, 0.5, 0, -0.5 or 1, which are difficult to threshold. I also removed the MOR.FC dataset as the distribution of Rho values was abnormally distributed

[]

#### ALP####
#Read in network nodes
DEG_PPI <- readLines("/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/ALSPARK2LRRK2/ALP_PPIGenes.txt")

setwd("/users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/GeneExpression/")
#Extract PPI network genes from each dataset#####
LEW <- read.csv("LEWfilteredresult.csv")
rownames(LEW) <- LEW$Gene.Symbol
LEW <- LEW[,28:41]
LEW <- subset(LEW, rownames(LEW) %in% DEG_PPI)

MID3 <- read.csv("MID3filteredresult.csv")
rownames(MID3) <- MID3$Gene.Symbol
MID3 <- MID3[,36:49]
MID3 <- subset(MID3, rownames(MID3) %in% DEG_PPI)

MID4 <- read.csv("MID4filteredresult.csv")
rownames(MID4) <- MID4$Gene.Symbol
MID4 <- MID4[,41:55]
MID4 <- subset(MID4, rownames(MID4) %in% DEG_PPI)

MOR.FC <- read.csv("MOR.FCfilteredresult.csv")
rownames(MOR.FC) <- MOR.FC$Gene.Symbol
MOR.FC <- MOR.FC[,24:28]
MOR.FC <- subset(MOR.FC, rownames(MOR.FC) %in% DEG_PPI)

DIJ <- read.csv("DIJfilteredresult.csv")
rownames(DIJ) <- DIJ$Gene.Symbol
DIJ <- DIJ[,29:43]
DIJ <- subset(DIJ, rownames(DIJ) %in% DEG_PPI)

FFR <- read.csv("FFRfilteredresult.csv")
rownames(FFR) <- FFR$Gene.Symbol
FFR <- FFR[,29:44]
FFR <- subset(FFR, rownames(FFR) %in% DEG_PPI)

MID1 <- read.csv("MID1filteredresult.csv")
rownames(MID1) <- MID1$Gene.Symbol
MID1 <- MID1[,28:37]
MID1 <- subset(MID1, rownames(MID1) %in% DEG_PPI)

MID2 <- read.csv("MID2filteredresult.csv")
rownames(MID2) <- MID2$Gene.Symbol
MID2 <- MID2[,39:49]
MID2 <- subset(MID2, rownames(MID2) %in% DEG_PPI)

MOR.SN <- read.csv("MOR.SNfilteredresult.csv")
rownames(MOR.SN) <- MOR.SN$Gene.Symbol
MOR.SN <- MOR.SN[,36:59]
MOR.SN <- subset(MOR.SN, rownames(MOR.SN) %in% DEG_PPI)

DUM <- read.csv("DUM_UniqueGene_DESeq2.csv")
rownames(DUM) <- DUM$hgnc_symbol
DUM <- DUM[,53:81]
DUM <- subset(DUM, rownames(DUM) %in% DEG_PPI)




#Find the gene names that all datasets have in common
DEG_com <- Reduce(intersect, list(rownames(DIJ), rownames(FFR), 
                                  rownames(LEW),rownames(MID1),rownames(MID2),
                                  rownames(MID3),rownames(MID4),rownames(MOR.FC),
                                  rownames(MOR.SN), rownames(DUM)))

This script resulted in 854 common names. Correlation analysis was conducted on sharc (/shared/hidelab2/user/mdp15cmg/Parkinsons/PD_Cor)

When the correlation relationships were downloaded and analysed, only 59 edges reached a consistent correlation value of over 0.1 or below -0.1 regardless of sign. This increases to 208 when the RNA-seq dataset is removed.

Yet Another New Approach

Talking to Kat we hypothesised that the Parkin Fibroblasts were perhaps driving away the signal too strongly. Also the number of samples was so small that it’s not necessarily giving a representative signal.

To compensate, I decided to go back to the blood samples and use some other samples from the AMA GSE99039 dataset which have familial mutations. There were 5 ATP13A2 cases, 15 Parkin cases, and 12 PINK1 cases. Two of the Parkin cases in the metadata didn’t have corresponding files, so this became 13 Parkin cases.

Limma

##### AMA ATP13A2 #########
setwd("/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Data/GSE99039_keepfiles/")

#run program to choose .CEL files from directory
celfiles <- fileBrowser(textToShow = "Choose CEL files", testFun = hasSuffix("[cC][eE][lL]"))
Data<-ReadAffy(filenames=celfiles) #read in files
rmaEset<-rma(Data) #normalise using RMA
analysis.name<-"AMA_ATP13A2" #Label analysis
dataMatrixAll<-exprs(rmaEset) #takes expression from normalised expression set


#mas5call generates presence/absence calls for each probeset
mas5call<-mas5calls(Data)
callMatrixAll<-exprs(mas5call)
colnames(callMatrixAll)<-sub(".CEL", ".mas5-Detection", colnames(callMatrixAll),fixed=TRUE)
colnames(callMatrixAll)<-sub(".cel", ".mas5-Detection", colnames(callMatrixAll),fixed=TRUE)
callMatrixAll<-as.data.frame(callMatrixAll)
callMatrixAll$ProbeSetID<-rownames(callMatrixAll)
countPf<-function(x){
  sum(x=="P")
}

#count how many samples have presence calls
countPl<-apply(callMatrixAll, 1, countPf)
callMatrixAll$ProbeSetID<-rownames(callMatrixAll)
countPdf<-data.frame(ProbeSetID=names(countPl), countP=countPl) 

#read annotation
# USING ANNOTATION FILE (if .csv, convert to .txt using excel)
annotation.file<-"/Users/clairegreen/Documents/PhD/TDP-43/TDP-43_Data/HG-U133_Plus_2.na35.annot.csv/HG-U133_Plus_2.na35_SHORT.annot.txt"
annotation<-read.table(annotation.file, header = TRUE, row.names=NULL, sep="\t", skip=0, stringsAsFactors=F, quote = "", comment.char="!", fill = TRUE, as.is = TRUE)
dim(annotation)
nrow(annotation)
annotation<-subset( annotation, subset=(Gene.Symbol !="---")) #if no gene symbol, discount

# Remove rows in which genes are noted to have negative strand matching probes
idxNegativeStrand<-grep("Negative Strand Matching Probes", annotation$Annotation.Notes)
if(length(idxNegativeStrand)>0)
{
  annotation<-annotation[-idxNegativeStrand,]
}


expressionMatrix<-exprs(rmaEset)
colnames(expressionMatrix)

#this is for matched samples
Treat<-factor(rep(c("Control", "Patient"),c(183,5)), levels=c("Control", "Patient"))
design<-model.matrix(~Treat)
rownames(design)<-colnames(expressionMatrix)
design

#Conduct statistical analysis of expression
library(limma)
fit<-lmFit(expressionMatrix, design) #linear model fit
fit<-eBayes(fit) 
result<-topTable(fit, coef="TreatPatient", adjust="BH", number=nrow(expressionMatrix)) #"BH" adjust for multiple hypothesis testing
#toptable normally takes top number but this takes all
setwd("/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/GeneExpression")
write.csv(result, file = paste(analysis.name, "_result.csv", sep = ""))

result$"ProbeSetID"<-rownames(result) #make probeset IDs the row names
head(result$"ProbeSetID") 
result$"Fold Change"<-2^result$logFC 
result$"Fold Change"[result$"Fold Change"<1]<-(-1)/result$"Fold Change"[result$"Fold Change"<1] #converts log fold change into a linear value above or below 0
expressionLinear<-as.data.frame(2^expressionMatrix)
expressionLinear$ProbeSetID<-rownames(expressionLinear)
result<-merge(result, expressionLinear, by.x="ProbeSetID", by.y="ProbeSetID") #merge values into one array
result<-merge(annotation, result, by.x="Probe.Set.ID", by.y="ProbeSetID")
result<-merge(result, countPdf, by.x="Probe.Set.ID", by.y="ProbeSetID")
result$Gene.Symbol <- sapply(strsplit(result$Gene.Symbol,"///"), `[`, 1)
result$Ensembl <- sapply(strsplit(result$Ensembl,"///"), `[`, 1)

setwd(dir = "/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/GeneExpression/")
write.csv(result, file=paste(analysis.name, "_finalresult.csv", sep=""), row.names=FALSE, quote = FALSE)

genesort <- result[order(result$P.Value),]
uniqueresult <- genesort[!duplicated(genesort[,5]),]
write.csv(uniqueresult, file=paste(analysis.name, "rankeduniqueresult.csv", sep=""), row.names=FALSE, quote = FALSE)

uniqueresult<-subset(uniqueresult, Gene.Symbol!="") #removes any probes for which there are no gene symbols
uniqueresult<-subset(uniqueresult, subset=(countP>20)) #only takes results that have at least 2 samples with a presence call for a probe
write.csv(uniqueresult, file=paste(analysis.name, "filteredresult.csv", sep=""), row.names=FALSE, quote = FALSE)

##### AMA PINK1 ##########
setwd("/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Data/GSE99039_keepfiles/")

#run program to choose .CEL files from directory
celfiles <- fileBrowser(textToShow = "Choose CEL files", testFun = hasSuffix("[cC][eE][lL]"))
Data<-ReadAffy(filenames=celfiles) #read in files
rmaEset<-rma(Data) #normalise using RMA
analysis.name<-"AMA_PINK1" #Label analysis
dataMatrixAll<-exprs(rmaEset) #takes expression from normalised expression set


#mas5call generates presence/absence calls for each probeset
mas5call<-mas5calls(Data)
callMatrixAll<-exprs(mas5call)
colnames(callMatrixAll)<-sub(".CEL", ".mas5-Detection", colnames(callMatrixAll),fixed=TRUE)
colnames(callMatrixAll)<-sub(".cel", ".mas5-Detection", colnames(callMatrixAll),fixed=TRUE)
callMatrixAll<-as.data.frame(callMatrixAll)
callMatrixAll$ProbeSetID<-rownames(callMatrixAll)
countPf<-function(x){
  sum(x=="P")
}

#count how many samples have presence calls
countPl<-apply(callMatrixAll, 1, countPf)
callMatrixAll$ProbeSetID<-rownames(callMatrixAll)
countPdf<-data.frame(ProbeSetID=names(countPl), countP=countPl) 

#read annotation
# USING ANNOTATION FILE (if .csv, convert to .txt using excel)
annotation.file<-"/Users/clairegreen/Documents/PhD/TDP-43/TDP-43_Data/HG-U133_Plus_2.na35.annot.csv/HG-U133_Plus_2.na35_SHORT.annot.txt"
annotation<-read.table(annotation.file, header = TRUE, row.names=NULL, sep="\t", skip=0, stringsAsFactors=F, quote = "", comment.char="!", fill = TRUE, as.is = TRUE)
dim(annotation)
nrow(annotation)
annotation<-subset( annotation, subset=(Gene.Symbol !="---")) #if no gene symbol, discount

# Remove rows in which genes are noted to have negative strand matching probes
idxNegativeStrand<-grep("Negative Strand Matching Probes", annotation$Annotation.Notes)
if(length(idxNegativeStrand)>0)
{
  annotation<-annotation[-idxNegativeStrand,]
}


expressionMatrix<-exprs(rmaEset)
colnames(expressionMatrix)

#this is for matched samples
Treat<-factor(rep(c("Control", "Patient"),c(183,12)), levels=c("Control", "Patient"))
design<-model.matrix(~Treat)
rownames(design)<-colnames(expressionMatrix)
design

#Conduct statistical analysis of expression
library(limma)
fit<-lmFit(expressionMatrix, design) #linear model fit
fit<-eBayes(fit) 
result<-topTable(fit, coef="TreatPatient", adjust="BH", number=nrow(expressionMatrix)) #"BH" adjust for multiple hypothesis testing
#toptable normally takes top number but this takes all
setwd("/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/GeneExpression")
write.csv(result, file = paste(analysis.name, "_result.csv", sep = ""))

result$"ProbeSetID"<-rownames(result) #make probeset IDs the row names
head(result$"ProbeSetID") 
result$"Fold Change"<-2^result$logFC 
result$"Fold Change"[result$"Fold Change"<1]<-(-1)/result$"Fold Change"[result$"Fold Change"<1] #converts log fold change into a linear value above or below 0
expressionLinear<-as.data.frame(2^expressionMatrix)
expressionLinear$ProbeSetID<-rownames(expressionLinear)
result<-merge(result, expressionLinear, by.x="ProbeSetID", by.y="ProbeSetID") #merge values into one array
result<-merge(annotation, result, by.x="Probe.Set.ID", by.y="ProbeSetID")
result<-merge(result, countPdf, by.x="Probe.Set.ID", by.y="ProbeSetID")
result$Gene.Symbol <- sapply(strsplit(result$Gene.Symbol,"///"), `[`, 1)
result$Ensembl <- sapply(strsplit(result$Ensembl,"///"), `[`, 1)

setwd(dir = "/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/GeneExpression/")
write.csv(result, file=paste(analysis.name, "_finalresult.csv", sep=""), row.names=FALSE, quote = FALSE)

genesort <- result[order(result$P.Value),]
uniqueresult <- genesort[!duplicated(genesort[,5]),]
write.csv(uniqueresult, file=paste(analysis.name, "rankeduniqueresult.csv", sep=""), row.names=FALSE, quote = FALSE)

uniqueresult<-subset(uniqueresult, Gene.Symbol!="") #removes any probes for which there are no gene symbols
uniqueresult<-subset(uniqueresult, subset=(countP>20)) #only takes results that have at least 2 samples with a presence call for a probe
write.csv(uniqueresult, file=paste(analysis.name, "filteredresult.csv", sep=""), row.names=FALSE, quote = FALSE)

##### AMA PARKIN ##########
setwd("/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Data/GSE99039_keepfiles/")

#run program to choose .CEL files from directory
celfiles <- fileBrowser(textToShow = "Choose CEL files", testFun = hasSuffix("[cC][eE][lL]"))
Data<-ReadAffy(filenames=celfiles) #read in files
rmaEset<-rma(Data) #normalise using RMA
analysis.name<-"AMA_PARKIN" #Label analysis
dataMatrixAll<-exprs(rmaEset) #takes expression from normalised expression set


#mas5call generates presence/absence calls for each probeset
mas5call<-mas5calls(Data)
callMatrixAll<-exprs(mas5call)
colnames(callMatrixAll)<-sub(".CEL", ".mas5-Detection", colnames(callMatrixAll),fixed=TRUE)
colnames(callMatrixAll)<-sub(".cel", ".mas5-Detection", colnames(callMatrixAll),fixed=TRUE)
callMatrixAll<-as.data.frame(callMatrixAll)
callMatrixAll$ProbeSetID<-rownames(callMatrixAll)
countPf<-function(x){
  sum(x=="P")
}

#count how many samples have presence calls
countPl<-apply(callMatrixAll, 1, countPf)
callMatrixAll$ProbeSetID<-rownames(callMatrixAll)
countPdf<-data.frame(ProbeSetID=names(countPl), countP=countPl) 

#read annotation
# USING ANNOTATION FILE (if .csv, convert to .txt using excel)
annotation.file<-"/Users/clairegreen/Documents/PhD/TDP-43/TDP-43_Data/HG-U133_Plus_2.na35.annot.csv/HG-U133_Plus_2.na35_SHORT.annot.txt"
annotation<-read.table(annotation.file, header = TRUE, row.names=NULL, sep="\t", skip=0, stringsAsFactors=F, quote = "", comment.char="!", fill = TRUE, as.is = TRUE)
dim(annotation)
nrow(annotation)
annotation<-subset( annotation, subset=(Gene.Symbol !="---")) #if no gene symbol, discount

# Remove rows in which genes are noted to have negative strand matching probes
idxNegativeStrand<-grep("Negative Strand Matching Probes", annotation$Annotation.Notes)
if(length(idxNegativeStrand)>0)
{
  annotation<-annotation[-idxNegativeStrand,]
}


expressionMatrix<-exprs(rmaEset)
colnames(expressionMatrix)

#this is for matched samples
Treat<-factor(rep(c("Control", "Patient"),c(183,13)), levels=c("Control", "Patient"))
design<-model.matrix(~Treat)
rownames(design)<-colnames(expressionMatrix)
design

#Conduct statistical analysis of expression
library(limma)
fit<-lmFit(expressionMatrix, design) #linear model fit
fit<-eBayes(fit) 
result<-topTable(fit, coef="TreatPatient", adjust="BH", number=nrow(expressionMatrix)) #"BH" adjust for multiple hypothesis testing
#toptable normally takes top number but this takes all
setwd("/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/GeneExpression")
write.csv(result, file = paste(analysis.name, "_result.csv", sep = ""))

result$"ProbeSetID"<-rownames(result) #make probeset IDs the row names
head(result$"ProbeSetID") 
result$"Fold Change"<-2^result$logFC 
result$"Fold Change"[result$"Fold Change"<1]<-(-1)/result$"Fold Change"[result$"Fold Change"<1] #converts log fold change into a linear value above or below 0
expressionLinear<-as.data.frame(2^expressionMatrix)
expressionLinear$ProbeSetID<-rownames(expressionLinear)
result<-merge(result, expressionLinear, by.x="ProbeSetID", by.y="ProbeSetID") #merge values into one array
result<-merge(annotation, result, by.x="Probe.Set.ID", by.y="ProbeSetID")
result<-merge(result, countPdf, by.x="Probe.Set.ID", by.y="ProbeSetID")
result$Gene.Symbol <- sapply(strsplit(result$Gene.Symbol,"///"), `[`, 1)
result$Ensembl <- sapply(strsplit(result$Ensembl,"///"), `[`, 1)

setwd(dir = "/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/GeneExpression/")
write.csv(result, file=paste(analysis.name, "_finalresult.csv", sep=""), row.names=FALSE, quote = FALSE)

genesort <- result[order(result$P.Value),]
uniqueresult <- genesort[!duplicated(genesort[,5]),]
write.csv(uniqueresult, file=paste(analysis.name, "rankeduniqueresult.csv", sep=""), row.names=FALSE, quote = FALSE)

uniqueresult<-subset(uniqueresult, Gene.Symbol!="") #removes any probes for which there are no gene symbols
uniqueresult<-subset(uniqueresult, subset=(countP>20)) #only takes results that have at least 2 samples with a presence call for a probe
write.csv(uniqueresult, file=paste(analysis.name, "filteredresult.csv", sep=""), row.names=FALSE, quote = FALSE)

Common Differentially Expressed Genes (Blood Edition)

#############################################################
################### FOLD CHANGE DEG #########################
#############################################################

setwd("/users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/GeneExpression/")

LEW <- read.csv("LEWfilteredresult.csv")
LEW <- LEW[order(LEW$P.Value),]
MID3 <- read.csv("MID3filteredresult.csv")
MID3 <- MID3[order(MID3$P.Value),]
MID4 <- read.csv("MID4filteredresult.csv")
MID4 <- MID4[order(MID4$P.Value),]
MOR.FC <- read.csv("MOR.FCfilteredresult.csv")
MOR.FC <- MOR.FC[order(MOR.FC$P.Value),]
DIJ <- read.csv("DIJfilteredresult.csv")
DIJ <- DIJ[order(DIJ$P.Value),]
FFR <- read.csv("FFRfilteredresult.csv")
FFR <- FFR[order(FFR$P.Value),]
MID1 <- read.csv("MID1filteredresult.csv")
MID1 <- MID1[order(MID1$P.Value),]
MID2 <- read.csv("MID2filteredresult.csv")
MID2 <- MID2[order(MID2$P.Value),]
MOR.SN <- read.csv("MOR.SNfilteredresult.csv")
MOR.SN <- MOR.SN[order(MOR.SN$P.Value),]
DUM <- read.csv("DUM_UniqueGene_DESeq2.csv")
DUM <- DUM[order(DUM$pvalue),]
BOT <- read.csv("BOTrankeduniqueresult.csv")
BOT <- BOT[order(BOT$P.Value),]
BOT2 <- read.csv("BOT2rankeduniqueresult.csv")
BOT2 <- BOT2[order(BOT2$P.Value),]




thresh <- 1

upLEW <- subset(LEW, LEW$Fold.Change >= thresh)
upLEWgene <- upLEW$Gene.Symbol

upMID3<- subset(MID3, MID3$Fold.Change >= thresh)
upMID3gene <- upMID3$Gene.Symbol

upMID4 <- subset(MID4, MID4$Fold.Change >= thresh)
upMID4gene <- upMID4$Gene.Symbol

upMOR.FC <- subset(MOR.FC, MOR.FC$Fold.Change >= thresh)
upMOR.FCgene <- upMOR.FC$Gene.Symbol

upDUM <- subset(DUM, DUM$log2FoldChange >= 0)
upDUMgene <- upDUM$hgnc_symbol

upDIJ <- subset(DIJ, DIJ$Fold.Change >= thresh)
upDIJgene <- upDIJ$Gene.Symbol

upFFR<- subset(FFR, FFR$Fold.Change >= thresh)
upFFRgene <- upFFR$Gene.Symbol

upMID1<- subset(MID1, MID1$Fold.Change >= thresh)
upMID1gene <- upMID1$Gene.Symbol

upMID2 <- subset(MID2, MID2$Fold.Change >= thresh)
upMID2gene <- upMID2$Gene.Symbol

upMOR.SN <- subset(MOR.SN, MOR.SN$Fold.Change >= thresh)
upMOR.SNgene <- upMOR.SN$Gene.Symbol

upBOT <- subset(BOT, BOT$Fold.Change >= thresh)
upBOTgene <- upBOT$Gene.Symbol

upBOT2 <- subset(BOT2, BOT2$Fold.Change >= thresh)
upBOT2gene <- upBOT2$Gene.Symbol


INTUP <- Reduce(intersect, list(upLEWgene, upMID3gene, upMID4gene, upMOR.FCgene,
                                upDIJgene, upFFRgene, upMID1gene, upMID2gene, upMOR.SNgene, upDUMgene, upBOTgene, upBOT2gene))



#### DOWN ####
thresh <- -1

downLEW <- subset(LEW, LEW$Fold.Change <= thresh)
downLEWgene <- downLEW$Gene.Symbol

downMID3 <- subset(MID3, MID3$Fold.Change <= thresh)
downMID3gene <- downMID3$Gene.Symbol

downMID4 <- subset(MID4, MID4$Fold.Change <= thresh)
downMID4gene <- downMID4$Gene.Symbol

downMOR.FC <- subset(MOR.FC, MOR.FC$Fold.Change <= thresh)
downMOR.FCgene <- downMOR.FC$Gene.Symbol

downDUM <- subset(DUM, DUM$log2FoldChange <= 0)
downDUMgene <- downDUM$hgnc_symbol

downDIJ <- subset(DIJ, DIJ$Fold.Change <= thresh)
downDIJgene <- downDIJ$Gene.Symbol

downFFR<- subset(FFR, FFR$Fold.Change <= thresh)
downFFRgene <- downFFR$Gene.Symbol

downMID1<- subset(MID1, MID1$Fold.Change <= thresh)
downMID1gene <- downMID1$Gene.Symbol

downMID2 <- subset(MID2, MID2$Fold.Change <= thresh)
downMID2gene <- downMID2$Gene.Symbol

downMOR.SN <- subset(MOR.SN, MOR.SN$Fold.Change <= thresh)
downMOR.SNgene <- downMOR.SN$Gene.Symbol


downBOT <- subset(BOT, BOT$Fold.Change <= thresh)
downBOTgene <- downBOT$Gene.Symbol

downBOT2 <- subset(BOT2, BOT2$Fold.Change <= thresh)
downBOT2gene <- downBOT2$Gene.Symbol

INTDOWN <- Reduce(intersect, list(downLEWgene, downMID3gene, downMID4gene, downMOR.FCgene,
                                  downDIJgene, downFFRgene, downMID1gene, downMID2gene, 
                                  downMOR.SNgene, downDUMgene, downBOTgene, downBOT2gene))




########################### COMMON GENES ##############################
all <- c(INTUP, INTDOWN)
setwd("/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/GeneExpression/FoldChange")
# write.table(INTUP,"Sp_LRRK2_upDEGs.txt", row.names = F, col.names = F, quote = F)
# write.table(INTDOWN,"Sp_LRRK2_downDEGs.txt", row.names = F, col.names = F, quote = F)
# write.table(all, "Sp_LRRK2_allDEGs.txt", row.names = F, col.names = F, quote = F)

PDgenes <- readLines("/Users/clairegreen/Documents/PhD/Parkinsons/ParkinsonsDiseaseMalacards.txt")

intersect(INTUP, PDgenes)
intersect(INTDOWN, PDgenes)



####### ALS signature ##########

setwd("/users/clairegreen/Documents/PhD/TDP-43/TDP-43_Code/Results/GeneExpression/noMedian/")

C9 <- read.csv("C9_unique.csv")
C9 <- C9[order(C9$P.Value),]
sals <- read.csv("sals_unique.csv")
sals <- sals[order(sals$P.Value),]

setwd("/users/clairegreen/Documents/PhD/TDP-43/TDP-43_Code/Results/GeneExpression/TDP-43_DEseq2/")

pet <- read.csv("PET_results_keepfiltering.csv")
rav <- read.csv("RAV_results_keepfiltering.csv")

setwd("/users/clairegreen/Documents/PhD/TDP-43/TDP-43_Code/Results/GeneExpression/non-TDP/")

FUS <- read.csv("FUSrankeduniqueresult.csv")
FUS <- FUS[order(FUS$P.Value),]
SOD1 <- read.csv("SOD1rankeduniqueresult.csv")
SOD1 <- SOD1[order(SOD1$P.Value),]


thresh <- 1

upC9 <- subset(C9, C9$Fold.Change >= thresh)
upC9gene <- upC9$Gene.Symbol

upSALS <- subset(sals, sals$Fold.Change >= thresh)
upSALSgene <- upSALS$Gene.Symbol

upPET <- subset(pet, pet$FoldChange >= thresh)
upPETgene <- upPET$hgnc_symbol

upRAV <- subset(rav, rav$FoldChange >= thresh)
upRAVgene <- upRAV$hgnc_symbol

upFUS <- subset(FUS, FUS$Fold.Change >= thresh)
upFUSgene <- upFUS$Gene.Symbol

upSOD1 <- subset(SOD1, SOD1$Fold.Change >= thresh)
upSOD1gene <- upSOD1$Gene.Symbol

INTUP_ALS <- Reduce(intersect, list(upC9gene, upSALSgene, upPETgene, upRAVgene, upFUSgene, upSOD1gene))


#### DOWN 
thresh <- -1

downC9 <- subset(C9, C9$Fold.Change <= thresh)
downC9gene <- downC9$Gene.Symbol

downSALS <- subset(sals, sals$Fold.Change <= thresh)
downSALSgene <- downSALS$Gene.Symbol

downPET <- subset(pet, pet$FoldChange <= thresh)
downPETgene <- downPET$hgnc_symbol

downRAV <- subset(rav, rav$FoldChange <= thresh)
downRAVgene <- downRAV$hgnc_symbol

downFUS <- subset(FUS, FUS$Fold.Change <= thresh)
downFUSgene <- downFUS$Gene.Symbol

downSOD1 <- subset(SOD1, SOD1$Fold.Change <= thresh)
downSOD1gene <- downSOD1$Gene.Symbol

INTDOWN_ALS <- Reduce(intersect, list(downC9gene, downSALSgene, downPETgene, downRAVgene, downFUSgene, downSOD1gene))



##### COMMON GENES ###
upremove <- Reduce(intersect, list (INTUP, INTUP_ALS))
downremove <- Reduce(intersect, list(INTDOWN, INTDOWN_ALS))



###### REMOVE COMMON GENES ###
resultsup <- subset(INTUP, !(INTUP %in% upremove))
resultsdown <- subset(INTDOWN, !(INTDOWN %in% downremove))
results <- c(resultsup, resultsdown)

####### sPD Blood signature ##########

setwd("/users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/GeneExpression/")

AMA <- read.csv("AMAfilteredresult.csv")
AMA <- AMA[order(AMA$P.Value),]
RON <- read.csv("RONfilteredresult.csv")
RON <- RON[order(RON$P.Value),]


thresh <- 1

upAMA <- subset(AMA, AMA$Fold.Change >= thresh)
upAMAgene <- upAMA$Gene.Symbol

upRON <- subset(RON, RON$Fold.Change >= thresh)
upRONgene <- upRON$Gene.Symbol


INTUP_blood <- Reduce(intersect, list(upAMAgene, upRONgene))


#### DOWN ###
thresh <- -1

downAMA <- subset(AMA, AMA$Fold.Change <= thresh)
downAMAgene <- downAMA$Gene.Symbol

downRON <- subset(RON, RON$Fold.Change <= thresh)
downRONgene <- downRON$Gene.Symbol


INTDOWN_blood <- Reduce(intersect, list(downAMAgene, downRONgene))



##### COMMON GENES ###
upremove2 <- Reduce(intersect, list(INTUP, INTUP_blood))
downremove2 <- Reduce(intersect, list(INTDOWN, INTDOWN_blood))



##### REMOVE COMMON GENES ###
resultsup <- subset(resultsup, !(resultsup %in% upremove2))
resultsdown <- subset(resultsdown, !(resultsdown %in% downremove2))
results <- c(resultsup, resultsdown)




setwd("/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/GeneExpression/FoldChange/")
# write.table(resultsup, "Sp_LRRK2_FiltALSPDblood_UPgenes.txt", quote = F, row.names = F, col.names = F)
# write.table(resultsdown, "Sp_LRRK2_FiltALSPDblood_DOWNgenes.txt", quote = F, row.names = F, col.names = F)
# write.table(results, "Sp_LRRK2_FiltALSPDblood_ALLgenes.txt", quote = F, row.names = F, col.names = F)
# cat(resultsup, sep="\n")

intersect(resultsup, PDgenes)
intersect(resultsdown, PDgenes)


####### fPD Blood signature ##########

setwd("/users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/GeneExpression/")

AMA_A <- read.csv("AMA_ATP13A2filteredresult.csv")
AMA_A <- AMA_A[order(AMA_A$P.Value),]
AMA_PARK <- read.csv("AMA_PARKINfilteredresult.csv")
AMA_PARK <- AMA_PARK[order(AMA_PARK$P.Value),]
AMA_PINK <- read.csv("AMA_PINK1filteredresult.csv")
AMA_PINK <- AMA_PINK[order(AMA_PINK$P.Value),]


thresh <- 1

upAMA_A <- subset(AMA_A, AMA_A$Fold.Change >= thresh)
upAMA_Agene <- AMA_A$Gene.Symbol

upAMA_PARK<- subset(AMA_PARK, AMA_PARK$Fold.Change >= thresh)
upAMA_PARKgene <- upAMA_PARK$Gene.Symbol

upAMA_PINK<- subset(AMA_PINK, AMA_PINK$Fold.Change >= thresh)
upAMA_PINKgene <- upAMA_PINK$Gene.Symbol


INTUP_fam_blood <- Reduce(intersect, list(upAMA_Agene, upAMA_PARKgene, upAMA_PINKgene))


#### DOWN ###
thresh <- -1

downAMA_A <- subset(AMA_A, AMA_A$Fold.Change <= thresh)
downAMA_Agene <- AMA_A$Gene.Symbol

downAMA_PARK<- subset(AMA_PARK, AMA_PARK$Fold.Change <= thresh)
downAMA_PARKgene <- downAMA_PARK$Gene.Symbol

downAMA_PINK<- subset(AMA_PINK, AMA_PINK$Fold.Change <= thresh)
downAMA_PINKgene <- downAMA_PINK$Gene.Symbol


INTDOWN_fam_blood <- Reduce(intersect, list(downAMA_Agene, downAMA_PARKgene, downAMA_PINKgene))



##### COMMON GENES ###
upremove3 <- Reduce(intersect, list(INTUP, INTUP_fam_blood))
downremove3 <- Reduce(intersect, list(INTDOWN, INTDOWN_fam_blood))



##### REMOVE COMMON GENES ###
resultsup <- subset(resultsup, !(resultsup %in% upremove3))
resultsdown <- subset(resultsdown, !(resultsdown %in% downremove3))
results <- c(resultsup, resultsdown)




setwd("/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/FamilialBlood/")
write.table(resultsup, "ALS_sfblood_UPgenes.txt", quote = F, row.names = F, col.names = F)
write.table(resultsdown, "ALS_sfblood_DOWNgenes.txt", quote = F, row.names = F, col.names = F)
write.table(results, "ALS_sfblood_ALLgenes.txt", quote = F, row.names = F, col.names = F)
cat(resultsup, sep="\n")

intersect(resultsup, PDgenes)
intersect(resultsdown, PDgenes)

This produced 170 commonly differentially expressed genes (ALS_sfblood_ALLgenes.txt). These genes included Malacards SCNA, MSX1, UCHL1, ALDH1A1 .

Protein Interaction Expansion

The 170 gene seed created a network of 2159 proteins (/familialblood/PPIgenes). EnrichR (http://amp.pharm.mssm.edu/Enrichr/enrich?dataset=3ztlg) shows high enrichment for a number of processes, making it hard to know exactly what the geneset represents. However for GO Cellular component the highest enrichments are still for the mitochondria

In this case, I am somewhat confident that I am looking in the right area. The only way we will know if it is, however, is to look at whether there is common coexpression

Common Coexpression in LRRK2/Sporadic PD

#### Parkinson's DEG PPI Correlation ####

#Read in network nodes

#### Familial Blood ####
#Read in network nodes
DEG_PPI <- readLines("/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/FamilialBlood/PPIGenes.txt")

setwd("/users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/GeneExpression/")
#Extract PPI network genes from each dataset#####
LEW <- read.csv("LEWfilteredresult.csv")
rownames(LEW) <- LEW$Gene.Symbol
LEW <- LEW[,28:41]
LEW <- subset(LEW, rownames(LEW) %in% DEG_PPI)

MID3 <- read.csv("MID3filteredresult.csv")
rownames(MID3) <- MID3$Gene.Symbol
MID3 <- MID3[,36:49]
MID3 <- subset(MID3, rownames(MID3) %in% DEG_PPI)

MID4 <- read.csv("MID4filteredresult.csv")
rownames(MID4) <- MID4$Gene.Symbol
MID4 <- MID4[,41:55]
MID4 <- subset(MID4, rownames(MID4) %in% DEG_PPI)

MOR.FC <- read.csv("MOR.FCfilteredresult.csv")
rownames(MOR.FC) <- MOR.FC$Gene.Symbol
MOR.FC <- MOR.FC[,24:28]
MOR.FC <- subset(MOR.FC, rownames(MOR.FC) %in% DEG_PPI)

DIJ <- read.csv("DIJfilteredresult.csv")
rownames(DIJ) <- DIJ$Gene.Symbol
DIJ <- DIJ[,29:43]
DIJ <- subset(DIJ, rownames(DIJ) %in% DEG_PPI)

FFR <- read.csv("FFRfilteredresult.csv")
rownames(FFR) <- FFR$Gene.Symbol
FFR <- FFR[,29:44]
FFR <- subset(FFR, rownames(FFR) %in% DEG_PPI)

MID1 <- read.csv("MID1filteredresult.csv")
rownames(MID1) <- MID1$Gene.Symbol
MID1 <- MID1[,28:37]
MID1 <- subset(MID1, rownames(MID1) %in% DEG_PPI)

MID2 <- read.csv("MID2filteredresult.csv")
rownames(MID2) <- MID2$Gene.Symbol
MID2 <- MID2[,39:49]
MID2 <- subset(MID2, rownames(MID2) %in% DEG_PPI)

MOR.SN <- read.csv("MOR.SNfilteredresult.csv")
rownames(MOR.SN) <- MOR.SN$Gene.Symbol
MOR.SN <- MOR.SN[,36:59]
MOR.SN <- subset(MOR.SN, rownames(MOR.SN) %in% DEG_PPI)

DUM <- read.csv("DUM_UniqueGene_DESeq2.csv")
rownames(DUM) <- DUM$hgnc_symbol
DUM <- DUM[,53:81]
DUM <- subset(DUM, rownames(DUM) %in% DEG_PPI)




#Find the gene names that all datasets have in common
DEG_com <- Reduce(intersect, list(rownames(DIJ), rownames(FFR), 
                                  rownames(LEW),rownames(MID1),rownames(MID2),
                                  rownames(MID3),rownames(MID4),rownames(MOR.FC),
                                  rownames(MOR.SN), rownames(DUM)))

#Subset each dataset with these common names so they are all the same size
DIJ <- subset(DIJ, rownames(DIJ) %in% DEG_com)
DUM <- subset(DUM, rownames(DUM) %in% DEG_com)
FFR <- subset(FFR, rownames(FFR) %in% DEG_com)
LEW <- subset(LEW, rownames(LEW) %in% DEG_com)
MID1 <- subset(MID1, rownames(MID1) %in% DEG_com)
MID2 <- subset(MID2, rownames(MID2) %in% DEG_com)
MID3 <- subset(MID3, rownames(MID3) %in% DEG_com)
MID4 <- subset(MID4, rownames(MID4) %in% DEG_com)
MOR.FC <- subset(MOR.FC, rownames(MOR.FC) %in% DEG_com)
MOR.SN <- subset(MOR.SN, rownames(MOR.SN) %in% DEG_com)

setwd("/users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/FamilialBlood")

write.csv(DIJ, "DIJ_DEGfilt.csv")
write.csv(DUM, "DUM_DEGfilt.csv")
write.csv(FFR, "FFR_DEGfilt.csv")
write.csv(LEW, "LEW_DEGfilt.csv")
write.csv(MID1, "MID1_DEGfilt.csv")
write.csv(MID2, "MID2_DEGfilt.csv")
write.csv(MID3, "MID3_DEGfilt.csv")
write.csv(MID4, "MID4_DEGfilt.csv")
write.csv(MOR.FC, "MOR.FC_DEGfilt.csv")
write.csv(MOR.SN, "MOR.SN_DEGfilt.csv")

#Run Correlation analysis on Sharc


#### filter correlations #### BOT/BOT2 not included due to less than 4 samples
setwd("/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/FamilialBlood/")

DIJ <- read.csv("DIJcorresult.csv")
DIJ$Gene1 <- as.character(lapply(strsplit(as.character(DIJ$X), "\\:"), "[", 2))
DIJ$Gene2 <- as.character(lapply(strsplit(as.character(DIJ$X), "\\:"), "[", 1))
DIJ <- DIJ[,c(5,6,1,2,3,4)]

DUM <- read.csv("DUMcorresult.csv")
DUM$Gene1 <- as.character(lapply(strsplit(as.character(DUM$X), "\\:"), "[", 2))
DUM$Gene2 <- as.character(lapply(strsplit(as.character(DUM$X), "\\:"), "[", 1))
DUM <- DUM[,c(5,6,1,2,3,4)]
DUM$X <- paste(DUM$Gene1,":",DUM$Gene2, sep = "")

FFR <- read.csv("FFRcorresult.csv")
FFR$Gene1 <- as.character(lapply(strsplit(as.character(FFR$X), "\\:"), "[", 2))
FFR$Gene2 <- as.character(lapply(strsplit(as.character(FFR$X), "\\:"), "[", 1))
FFR <- FFR[,c(5,6,1,2,3,4)]
FFR$X <- paste(FFR$Gene1,":",FFR$Gene2, sep = "")

LEW <- read.csv("LEWcorresult.csv")
LEW$Gene1 <- as.character(lapply(strsplit(as.character(LEW$X), "\\:"), "[", 2))
LEW$Gene2 <- as.character(lapply(strsplit(as.character(LEW$X), "\\:"), "[", 1))
LEW <- LEW[,c(5,6,1,2,3,4)]

MID1 <- read.csv("MID1corresult.csv")
MID1$Gene1 <- as.character(lapply(strsplit(as.character(MID1$X), "\\:"), "[", 2))
MID1$Gene2 <- as.character(lapply(strsplit(as.character(MID1$X), "\\:"), "[", 1))
MID1 <- MID1[,c(5,6,1,2,3,4)]
MID1$X <- paste(MID1$Gene1,":",MID1$Gene2, sep = "")

MID2 <- read.csv("MID2corresult.csv")
MID2$Gene1 <- as.character(lapply(strsplit(as.character(MID2$X), "\\:"), "[", 2))
MID2$Gene2 <- as.character(lapply(strsplit(as.character(MID2$X), "\\:"), "[", 1))
MID2 <- MID2[,c(5,6,1,2,3,4)]

MID3 <- read.csv("MID3corresult.csv")
MID3$Gene1 <- as.character(lapply(strsplit(as.character(MID3$X), "\\:"), "[", 2))
MID3$Gene2 <- as.character(lapply(strsplit(as.character(MID3$X), "\\:"), "[", 1))
MID3 <- MID3[,c(5,6,1,2,3,4)]

MID4 <- read.csv("MID4corresult.csv")
MID4$Gene1 <- as.character(lapply(strsplit(as.character(MID4$X), "\\:"), "[", 2))
MID4$Gene2 <- as.character(lapply(strsplit(as.character(MID4$X), "\\:"), "[", 1))
MID4 <- MID4[,c(5,6,1,2,3,4)]
MID4$X <- paste(MID4$Gene1,":",MID4$Gene2, sep = "")

MOR.FC<- read.csv("MOR.FCcorresult.csv")
MOR.FC$Gene1 <- as.character(lapply(strsplit(as.character(MOR.FC$X), "\\:"), "[", 2))
MOR.FC$Gene2 <- as.character(lapply(strsplit(as.character(MOR.FC$X), "\\:"), "[", 1))
MOR.FC <- MOR.FC[,c(5,6,1,2,3,4)]

MOR.SN<- read.csv("MOR.SNcorresult.csv")
MOR.SN$Gene1 <- as.character(lapply(strsplit(as.character(MOR.SN$X), "\\:"), "[", 2))
MOR.SN$Gene2 <- as.character(lapply(strsplit(as.character(MOR.SN$X), "\\:"), "[", 1))
MOR.SN <- MOR.SN[,c(5,6,1,2,3,4)]

thresh <- 0.1
### Filter by r value
DIJ_cor.5 <- DIJ[DIJ$reg.mat > thresh | DIJ$reg.mat < -thresh,]
# DUM_cor.5 <- DUM[DUM$reg.mat > thresh | DUM$reg.mat < -thresh,]
FFR_cor.5 <- FFR[FFR$reg.mat > thresh | FFR$reg.mat < -thresh,]
LEW_cor.5 <- LEW[LEW$reg.mat > thresh | LEW$reg.mat < -thresh,]
MID1_cor.5 <- MID1[MID1$reg.mat > thresh | MID1$reg.mat < -thresh,]
MID2_cor.5 <- MID2[MID2$reg.mat > thresh | MID2$reg.mat < -thresh,]
MID3_cor.5 <- MID3[MID3$reg.mat > thresh | MID3$reg.mat < -thresh,]
MID4_cor.5 <- MID4[MID4$reg.mat > thresh | MID4$reg.mat < -thresh,]
# MOR.FC_cor.5 <- MOR.FC[MOR.FC$reg.mat > thresh | MOR.FC$reg.mat < -thresh,]
MOR.SN_cor.5 <- MOR.SN[MOR.SN$reg.mat > thresh | MOR.SN$reg.mat < -thresh,]


### Find matches 

Commonedge <- Reduce(intersect, list(DIJ_cor.5$X,FFR_cor.5$X,
                                     LEW_cor.5$X, MID1_cor.5$X, MID2_cor.5$X,
                                     MID3_cor.5$X, MID4_cor.5$X,
                                     MOR.SN_cor.5$X))


#Subset each dataset with these common names so they are all the same size
DIJ_CE <- subset(DIJ_cor.5, DIJ_cor.5$X %in% Commonedge)
DIJ_CE <- DIJ_CE[order(DIJ_CE$X),]
# DUM_CE <- subset(DUM_cor.5, DUM_cor.5$X %in% Commonedge)
# DUM_CE <- DUM_CE[order(DUM_CE$X),]
FFR_CE <- subset(FFR_cor.5, FFR_cor.5$X %in% Commonedge)
FFR_CE <- FFR_CE[order(FFR_CE$X),]
LEW_CE <- subset(LEW_cor.5, LEW_cor.5$X %in% Commonedge)
LEW_CE <- LEW_CE[order(LEW_CE$X),]
MID1_CE <- subset(MID1_cor.5, MID1_cor.5$X %in% Commonedge)
MID1_CE <- MID1_CE[order(MID1_CE$X),]
MID2_CE <- subset(MID2_cor.5, MID2_cor.5$X %in% Commonedge)
MID2_CE <- MID2_CE[order(MID2_CE$X),]
MID3_CE <- subset(MID3_cor.5, MID3_cor.5$X %in% Commonedge)
MID3_CE <- MID3_CE[order(MID3_CE$X),]
MID4_CE <- subset(MID4_cor.5, MID4_cor.5$X %in% Commonedge)
MID4_CE <- MID4_CE[order(MID4_CE$X),]
# MOR.FC_CE <- subset(MOR.FC_cor.5, MOR.FC_cor.5$X %in% Commonedge)
# MOR.FC_CE <- MOR.FC_CE[order(MOR.FC_CE$X),]
MOR.SN_CE <- subset(MOR.SN_cor.5, MOR.SN_cor.5$X %in% Commonedge)
MOR.SN_CE <- MOR.SN_CE[order(MOR.SN_CE$X),]



CommonGroup <- data.frame(row.names = DIJ_CE$X,
                          DIJ = DIJ_CE$reg.mat,
                          FFR = FFR_CE$reg.mat,
                          LEW = LEW_CE$reg.mat,
                          MID1 = MID1_CE$reg.mat,
                          MID2 = MID2_CE$reg.mat, 
                          MID3 = MID3_CE$reg.mat, 
                          MID4 = MID4_CE$reg.mat, 
                          MOR.SN = MOR.SN_CE$reg.mat)


CG_conserved_up <- CommonGroup[apply(CommonGroup, MARGIN = 1, function(x) all(x > 0)), ]
CG_conserved_down <- CommonGroup[apply(CommonGroup, MARGIN = 1, function(x) all(x < 0)), ]

setwd("/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/FamilialBlood/")
CG_samedir <- rbind(CG_conserved_up, CG_conserved_down)
CG_samedir$corMean <- rowMeans(CG_samedir, na.rm = FALSE, dims = 1)
CG_samedir$Gene <- rownames(CG_samedir)
CG_samedir$Gene1 <- as.character(lapply(strsplit(as.character(CG_samedir$Gene), "\\:"), "[", 2))
CG_samedir$Gene2 <- as.character(lapply(strsplit(as.character(CG_samedir$Gene), "\\:"), "[", 1))

write.csv(CG_samedir, "PD_samedir_1.csv", quote = F, row.names = F)

nodetable <- read.csv("PD_samedir_1node.csv")
nuggenes <- as.character(nodetable$name)
PDgenes <- readLines("/Users/clairegreen/Documents/PhD/Parkinsons/ParkinsonsDiseaseMalacards.txt")
celltype <- read.csv("~/Documents/PhD/TDP-43/TDP-43_Code/Results/PPI_Network/Zhang_BrainCelltype_Markers_braingenes.csv")
DEG <- readLines("ALS_sfblood_ALLgenes.txt")
nalls <- readLines("/Users/clairegreen/Documents/PhD/Parkinsons/NallsPDGWAS.txt")
targetvalPD <- c("SIRT2", "DNM1L", "STMN1", "DNAJA3", "TBK1", "RTCA", "ANXA7", "DNAJC12", "RTN4", "ADSL", "MDH1","ATP6V1G2", 
                 "YWHAZ", "ETS1")

nodetable$PDMalacards <- nodetable$shared.name %in% PDgenes
nodetable$DEG <- nodetable$name %in% DEG
nodetable$targetvalPD <- nodetable$shared.name %in% targetvalPD
nodetable$targetvalPD <- nodetable$shared.name %in% nalls

nodetable_celltype <- merge(celltype, nodetable, by.x = "Gene.symbol",  by.y = "shared.name", all = T)
nodetable_celltype <- subset(nodetable_celltype, !(nodetable_celltype$SUID == "NA"))


nodetable_celltype$targetvalPD <- nodetable_celltype$Gene.symbol %in% targetvalPD
write.csv(nodetable_celltype, "ModifiedPDNodetable.csv", row.names = F)

This analysis didn’t go as well as the TDP-43 case. The correlations were much more variable meaning I had to reduce the thresholds from 0.5 to 0.1. Additionally, I removed four datasets - the LRRK2 datasets couldn’t be used because they only had 2 and 3 samples - correlation would never calculated properly. DUM was an RNA-seq dataset so I was losing annotation coverage, and MOR.FC showed a really odd rho value distribution…

Looking back this is probably because it only has 3 control samples.

Using these 8 datasets, with a threshold of 0.1, This generated a network containing 279 nodes and 292 edges. Within this network was one clear largest connected component of 208 nodes and 249 edges.

If I increase the threshold to 0.2, the resulting network is much smaller:

Functional Enrichment

http://amp.pharm.mssm.edu/Enrichr/enrich?dataset=3zurx

The enrichments suggest that there is an immune process signal. There is also still an enrichment in the mitochondria for GO Cellular Component.

Rerunning after removal of LEW samples

I realised after looking more closely at the paper that the LEW samples had some olivary nucleus samples which were supposed to be unaffected controls. Removing these samples meant I needed to essentially run everything again (again).

Common DEGs

Rerunning left 210 genes this time (ALS_sfblood_ALLgenes.txt).

> intersect(resultsup, PDgenes)
[1] "MSX1" "SPR" 
> intersect(resultsdown, PDgenes)
[1] "SNCA"    "ALDH1A1" "UCHL1"   "CYCS"  

Functional enrichment in EnrichR for upregulated and downregulated genes. Upregulated signal isn’t entirely clear; a number of signalling and DNA-binding pathways are enriched but not with the strongest p values. Alternatively there is quite a strong signal in the downregulated genes. A number of neurodegenerative pathways (inc Parkinson’s disease) are enriched, as well as the electron transport chain/mitochondrial processes (mitochondrion (GO:0005739) padj = 4.711e-8, 23 genes). GO Biological Process also implicates autophagy, which highlights the presence of CHMP2B - a causative ALS/FTLD gene.

Symbol Description Category GIFTs Matched Phenotypes Matched Phenotypes Scount Global Rank (Total Genes 11967) Score
SNCA Synuclein Alpha Protein Coding 61 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 2 378.27
UCHL1 Ubiquitin C-Terminal Hydrolase L1 Protein Coding 62 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 7 209.64
CYCS Cytochrome C, Somatic Protein Coding 57 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 63 63.23
NDUFV2 NADH:Ubiquinone Oxidoreductase Core Subunit V2 Protein Coding 53 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 106 51.16
ALDH1A1 Aldehyde Dehydrogenase 1 Family Member A1 Protein Coding 57 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 153 42.23
SPR Sepiapterin Reductase Protein Coding 58 Parkinson’s Disease, LRRK2, late-onset 3 126 40.38
MSX1 Msh Homeobox 1 Protein Coding 55 Parkinson’s Disease, LRRK2, late-onset 3 169 33.61
NDUFS1 NADH:Ubiquinone Oxidoreductase Core Subunit S1 Protein Coding 56 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 213 28.87
NDUFS3 NADH:Ubiquinone Oxidoreductase Core Subunit S3 Protein Coding 56 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 236 26.54
SYT11 Synaptotagmin 11 Protein Coding 50 Parkinson’s Disease, LRRK2, neurodegeneration 3 239 22.93
CASP7 Caspase 7 Protein Coding 59 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 309 21.45
TOMM20 Translocase Of Outer Mitochondrial Membrane 20 Protein Coding 48 Parkinson’s Disease, LRRK2, neurodegeneration 3 362 16.56
MFN2 Mitofusin 2 Protein Coding 59 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 424 16.55
IGF1R Insulin Like Growth Factor 1 Receptor Protein Coding 66 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 435 16.27
MAP1B Microtubule Associated Protein 1B Protein Coding 51 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 474 15.18
TMEM126B Transmembrane Protein 126B Protein Coding 47 Parkinson’s Disease, LRRK2 2 363 13.45
PTS 6-Pyruvoyltetrahydropterin Synthase Protein Coding 57 Parkinson’s Disease, late-onset, neurodegeneration 3 556 11.88
CASP6 Caspase 6 Protein Coding 57 Parkinson’s Disease, LRRK2, neurodegeneration 3 573 11.74
RAB7A RAB7A, Member RAS Oncogene Family Protein Coding 58 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 759 11.00
PPP2R2B Protein Phosphatase 2 Regulatory Subunit Bbeta Protein Coding 55 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 788 10.69
PYROXD1 Pyridine Nucleotide-Disulphide Oxidoreductase Domain 1 Protein Coding 44 Parkinson’s Disease, late-onset 2 487 10.60
AGRP Agouti Related Neuropeptide Protein Coding 51 late-onset, neurodegeneration 2 504 10.27
AIFM1 Apoptosis Inducing Factor Mitochondria Associated 1 Protein Coding 59 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 898 9.70
FLT1 Fms Related Tyrosine Kinase 1 Protein Coding 62 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 952 9.26
RTN4 Reticulon 4 Protein Coding 54 LRRK2, late-onset, neurodegeneration 3 792 9.21
DYNC1H1 Dynein Cytoplasmic 1 Heavy Chain 1 Protein Coding 54 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 965 9.11
TUBA4A Tubulin Alpha 4a Protein Coding 56 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 968 9.09
CHMP2B Charged Multivesicular Body Protein 2B Protein Coding 53 Parkinson’s Disease, late-onset, neurodegeneration 3 894 8.43
RGS2 Regulator Of G Protein Signaling 2 Protein Coding 51 Parkinson’s Disease, LRRK2, neurodegeneration 3 944 8.07
HIST1H1C Histone Cluster 1 H1 Family Member C Protein Coding 48 Parkinson’s Disease 1 520 7.11
PAWR Pro-Apoptotic WT1 Regulator Protein Coding 48 Parkinson’s Disease, neurodegeneration 2 982 6.36
MS4A6A Membrane Spanning 4-Domains A6A Protein Coding 45 Parkinson’s Disease, late-onset, neurodegeneration 3 1259 6.26
MYO1C Myosin IC Protein Coding 53 Parkinson’s Disease, LRRK2, neurodegeneration 3 1362 5.78
SUMO1 Small Ubiquitin-Like Modifier 1 Protein Coding 56 Parkinson’s Disease, neurodegeneration 2 1277 5.04
ACLY ATP Citrate Lyase Protein Coding 53 Parkinson’s Disease, LRRK2, neurodegeneration 3 1577 4.91
SDHA Succinate Dehydrogenase Complex Flavoprotein Subunit A Protein Coding 56 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 1850 4.77
PGF Placental Growth Factor Protein Coding 54 LRRK2, late-onset 2 1361 4.72
TRADD TNFRSF1A Associated Via Death Domain Protein Coding 51 Parkinson’s Disease, LRRK2, neurodegeneration 3 1694 4.57
HSPA1L Heat Shock Protein Family A (Hsp70) Member 1 Like Protein Coding 52 Parkinson’s Disease, LRRK2 2 1457 4.38
GABBR2 Gamma-Aminobutyric Acid Type B Receptor Subunit 2 Protein Coding 57 Parkinson’s Disease, LRRK2, late-onset 3 1880 4.04
PGAM1 Phosphoglycerate Mutase 1 Protein Coding 54 Parkinson’s Disease, neurodegeneration 2 1602 3.95
ATP6V1E1 ATPase H+ Transporting V1 Subunit E1 Protein Coding 52 Parkinson’s Disease, LRRK2, late-onset 3 2008 3.70
FZD5 Frizzled Class Receptor 5 Protein Coding 57 Parkinson’s Disease, LRRK2 2 1718 3.67
MRPS22 Mitochondrial Ribosomal Protein S22 Protein Coding 50 Parkinson’s Disease, neurodegeneration 2 1776 3.54
PLIN2 Perilipin 2 Protein Coding 51 late-onset, neurodegeneration 2 1805 3.47
CIRBP Cold Inducible RNA Binding Protein Protein Coding 48 Parkinson’s Disease, LRRK2, neurodegeneration 3 2119 3.38
SNX10 Sorting Nexin 10 Protein Coding 49 Parkinson’s Disease, LRRK2 2 1913 3.22
FABP3 Fatty Acid Binding Protein 3 Protein Coding 54 Parkinson’s Disease, neurodegeneration 2 1947 3.13
BCAT2 Branched Chain Amino Acid Transaminase 2 Protein Coding 53 Parkinson’s Disease, neurodegeneration 2 1987 3.05
ADH1B Alcohol Dehydrogenase 1B (Class I), Beta Polypeptide Protein Coding 53 Parkinson’s Disease, late-onset 2 2004 3.03
RARRES2 Retinoic Acid Receptor Responder 2 Protein Coding 50 late-onset 1 1536 2.92
DNAJC24 DnaJ Heat Shock Protein Family (Hsp40) Member C24 Protein Coding 44 LRRK2, late-onset 2 2118 2.76
PRPF4 Pre-MRNA Processing Factor 4 Protein Coding 51 Parkinson’s Disease, LRRK2, neurodegeneration 3 2366 2.55
FLCN Folliculin Protein Coding 50 Parkinson’s Disease, late-onset 2 2227 2.46
COPS5 COP9 Signalosome Subunit 5 Protein Coding 51 Parkinson’s Disease, LRRK2, neurodegeneration 3 2450 2.31
ENOPH1 Enolase-Phosphatase 1 Protein Coding 49 Parkinson’s Disease 1 2186 1.83
ZC3HAV1 Zinc Finger CCCH-Type Containing, Antiviral 1 Protein Coding 46 Parkinson’s Disease, neurodegeneration 2 2481 1.81
PLCG1 Phospholipase C Gamma 1 Protein Coding 56 Parkinson’s Disease, LRRK2 2 2492 1.80
STAB1 Stabilin 1 Protein Coding 48 Parkinson’s Disease 1 2218 1.76
HP1BP3 Heterochromatin Protein 1 Binding Protein 3 Protein Coding 44 Parkinson’s Disease, neurodegeneration 2 2528 1.71
MLLT11 MLLT11, Transcription Factor 7 Cofactor Protein Coding 45 late-onset 1 2261 1.68
AGL Amylo-Alpha-1, 6-Glucosidase, 4-Alpha-Glucanotransferase Protein Coding 55 late-onset 1 2261 1.68
MAVS Mitochondrial Antiviral Signaling Protein Protein Coding 50 late-onset 1 2261 1.68
TAZ Tafazzin Protein Coding 52 late-onset 1 2261 1.68
SLC6A1 Solute Carrier Family 6 Member 1 Protein Coding 57 Parkinson’s Disease, neurodegeneration 2 2545 1.67
STAT4 Signal Transducer And Activator Of Transcription 4 Protein Coding 56 LRRK2, late-onset 2 2559 1.63
EIF4E Eukaryotic Translation Initiation Factor 4E Protein Coding 59 Parkinson’s Disease, LRRK2 2 2565 1.62
NIPSNAP1 Nipsnap Homolog 1 Protein Coding 49 Parkinson’s Disease, LRRK2 2 2565 1.62
RAD51C RAD51 Paralog C Protein Coding 51 Parkinson’s Disease, LRRK2 2 2573 1.60
NCALD Neurocalcin Delta Protein Coding 51 Parkinson’s Disease, LRRK2 2 2573 1.60
LRRC1 Leucine Rich Repeat Containing 1 Protein Coding 44 Parkinson’s Disease, LRRK2 2 2573 1.60
PKNOX1 PBX/Knotted 1 Homeobox 1 Protein Coding 47 Parkinson’s Disease, LRRK2 2 2573 1.60
GABARAPL1 GABA Type A Receptor Associated Protein Like 1 Protein Coding 53 Parkinson’s Disease, neurodegeneration 2 2609 1.51
MAP1LC3B Microtubule Associated Protein 1 Light Chain 3 Beta Protein Coding 52 Parkinson’s Disease, neurodegeneration 2 2611 1.50
LYVE1 Lymphatic Vessel Endothelial Hyaluronan Receptor 1 Protein Coding 51 late-onset, neurodegeneration 2 2651 1.41
DYNC1I1 Dynein Cytoplasmic 1 Intermediate Chain 1 Protein Coding 48 Parkinson’s Disease, neurodegeneration 2 2666 1.37
FOXO4 Forkhead Box O4 Protein Coding 51 Parkinson’s Disease, LRRK2 2 2667 1.37
NME3 NME/NM23 Nucleoside Diphosphate Kinase 3 Protein Coding 51 Parkinson’s Disease, neurodegeneration 2 2680 1.33
SERTAD3 SERTA Domain Containing 3 Protein Coding 38 neurodegeneration 1 2476 1.29
USP8 Ubiquitin Specific Peptidase 8 Protein Coding 57 Parkinson’s Disease, LRRK2, neurodegeneration 3 2776 1.27
ATG14 Autophagy Related 14 Protein Coding 43 Parkinson’s Disease 1 2551 1.17
LDLRAD4 Low Density Lipoprotein Receptor Class A Domain Containing 4 Protein Coding 40 Parkinson’s Disease 1 2551 1.17
TAF4 TATA-Box Binding Protein Associated Factor 4 Protein Coding 48 LRRK2, neurodegeneration 2 2778 1.03
RBL1 RB Transcriptional Corepressor Like 1 Protein Coding 51 LRRK2, late-onset 2 2802 0.95
IARS2 Isoleucyl-TRNA Synthetase 2, Mitochondrial Protein Coding 49 Parkinson’s Disease 1 2681 0.94
KLHL36 Kelch Like Family Member 36 Protein Coding 41 late-onset 1 2683 0.93
ST3GAL5 ST3 Beta-Galactoside Alpha-2,3-Sialyltransferase 5 Protein Coding 55 late-onset 1 2706 0.88
CLIC2 Chloride Intracellular Channel 2 Protein Coding 49 late-onset 1 2706 0.88
DDX42 DEAD-Box Helicase 42 Protein Coding 46 Parkinson’s Disease 1 2711 0.87
ANXA6 Annexin A6 Protein Coding 53 Parkinson’s Disease 1 2711 0.87
CCDC40 Coiled-Coil Domain Containing 40 Protein Coding 45 Parkinson’s Disease 1 2711 0.87
LAG3 Lymphocyte Activating 3 Protein Coding 44 Parkinson’s Disease 1 2711 0.87
SDCCAG3 Serologically Defined Colon Cancer Antigen 3 Protein Coding 41 Parkinson’s Disease 1 2711 0.87
BCAR3 BCAR3, NSP Family Adaptor Protein Protein Coding 49 Parkinson’s Disease 1 2711 0.87
IMP4 IMP4, U3 Small Nucleolar Ribonucleoprotein Protein Coding 42 Parkinson’s Disease 1 2711 0.87
LETMD1 LETM1 Domain Containing 1 Protein Coding 43 Parkinson’s Disease 1 2711 0.87
PTP4A1 Protein Tyrosine Phosphatase Type IVA, Member 1 Protein Coding 51 Parkinson’s Disease 1 2711 0.87
RCC1 Regulator Of Chromosome Condensation 1 Protein Coding 49 LRRK2, neurodegeneration 2 2829 0.86
TXNL1 Thioredoxin Like 1 Protein Coding 47 late-onset 1 2741 0.79
CD84 CD84 Molecule Protein Coding 51 late-onset 1 2806 0.66
DUSP9 Dual Specificity Phosphatase 9 Protein Coding 50 LRRK2 1 2807 0.66
FHIT Fragile Histidine Triad Protein Coding 54 Parkinson’s Disease, late-onset 2 2896 0.61
ANGPT2 Angiopoietin 2 Protein Coding 55 LRRK2, late-onset 2 2901 0.58
INHBB Inhibin Beta B Subunit Protein Coding 53 LRRK2, late-onset 2 2901 0.58
CALML4 Calmodulin Like 4 Protein Coding 41 LRRK2 1 2853 0.54
HDAC7 Histone Deacetylase 7 Protein Coding 52 LRRK2 1 2853 0.54
ATP2B2 ATPase Plasma Membrane Ca2+ Transporting 2 Protein Coding 54 LRRK2, neurodegeneration 2 2915 0.52
PPP2R1B Protein Phosphatase 2 Scaffold Subunit Abeta Protein Coding 57 LRRK2 1 2905 0.39
RGS4 Regulator Of G Protein Signaling 4 Protein Coding 52 LRRK2 1 2905 0.39
KPNA1 Karyopherin Subunit Alpha 1 Protein Coding 51 Parkinson’s Disease, neurodegeneration 2 2942 0.35
NRXN3 Neurexin 3 Protein Coding 51 neurodegeneration 1 2920 0.34
SARS Seryl-TRNA Synthetase Protein Coding 53 neurodegeneration 1 2920 0.34
PGRMC1 Progesterone Receptor Membrane Component 1 Protein Coding 53 neurodegeneration 1 2920 0.34
RDH16 Retinol Dehydrogenase 16 (All-Trans) Protein Coding 44 LRRK2 1 2936 0.28
SMAD6 SMAD Family Member 6 Protein Coding 54 LRRK2 1 2936 0.28
NFATC4 Nuclear Factor Of Activated T Cells 4 Protein Coding 53 LRRK2 1 2936 0.28
REEP5 Receptor Accessory Protein 5 Protein Coding 46 LRRK2 1 2936 0.28
EFNA4 Ephrin A4 Protein Coding 50 LRRK2 1 2936 0.28
MST1 Macrophage Stimulating 1 Protein Coding 55 LRRK2 1 2936 0.28
FSTL3 Follistatin Like 3 Protein Coding 48 LRRK2 1 2936 0.28
LAMA5 Laminin Subunit Alpha 5 Protein Coding 51 LRRK2 1 2936 0.28
PRC1 Protein Regulator Of Cytokinesis 1 Protein Coding 47 LRRK2 1 2936 0.28
LAMB2 Laminin Subunit Beta 2 Protein Coding 55 LRRK2 1 2936 0.28
ZNF629 Zinc Finger Protein 629 Protein Coding 38 LRRK2 1 2938 0.27
NR6A1 Nuclear Receptor Subfamily 6 Group A Member 1 Protein Coding 50 LRRK2 1 2938 0.27
ATP8A2 ATPase Phospholipid Transporting 8A2 Protein Coding 50 LRRK2 1 2938 0.27
AREL1 Apoptosis Resistant E3 Ubiquitin Protein Ligase 1 Protein Coding 38 LRRK2 1 2938 0.27
ZNF692 Zinc Finger Protein 692 Protein Coding 40 LRRK2 1 2938 0.27
ROBO3 Roundabout Guidance Receptor 3 Protein Coding 52 LRRK2 1 2938 0.27
ACAT2 Acetyl-CoA Acetyltransferase 2 Protein Coding 55 LRRK2 1 2938 0.27
STX2 Syntaxin 2 Protein Coding 48 LRRK2 1 2938 0.27
TCF3 Transcription Factor 3 Protein Coding 55 LRRK2 1 2938 0.27
RAB6B RAB6B, Member RAS Oncogene Family Protein Coding 46 LRRK2 1 2938 0.27
DOCK6 Dedicator Of Cytokinesis 6 Protein Coding 50 LRRK2 1 2938 0.27
INSIG2 Insulin Induced Gene 2 Protein Coding 47 LRRK2 1 2938 0.27
DNAJB9 DnaJ Heat Shock Protein Family (Hsp40) Member B9 Protein Coding 45 LRRK2 1 2938 0.27
HLTF Helicase Like Transcription Factor Protein Coding 48 LRRK2 1 2938 0.27
VPS4B Vacuolar Protein Sorting 4 Homolog B Protein Coding 51 Parkinson’s Disease 1 2952 0.18
SLC23A2 Solute Carrier Family 23 Member 2 Protein Coding 52 Parkinson’s Disease 1 2952 0.18
TULP2 Tubby Like Protein 2 Protein Coding 44 late-onset 1 2955 0.13
ALAS1 5’-Aminolevulinate Synthase 1 Protein Coding 53 late-onset 1 2955 0.13
PTBP2 Polypyrimidine Tract Binding Protein 2 Protein Coding 47 neurodegeneration 1 2959 0.07
ZMYND8 Zinc Finger MYND-Type Containing 8 Protein Coding 46 neurodegeneration 1 2959 0.07
WDYHV1 WDYHV Motif Containing 1 Protein Coding 39 neurodegeneration 1 2959 0.07
FAM193B Family With Sequence Similarity 193 Member B Protein Coding 42 neurodegeneration 1 2959 0.07
SEZ6L2 Seizure Related 6 Homolog Like 2 Protein Coding 46 neurodegeneration 1 2959 0.07

Protein Interaction Network

Using the 210 DEGs I seeded a network of 2429 nodes and 4097 edges.

library(biomaRt)

setwd("/Users/clairegreen/Documents/PhD/TDP-43/TDP-43_Code/Results/PPI_Network/")

PPI <- read.table("iref14_Human_UP_noDup_table_nodash.txt", header = T)
braingenes <- read.csv("Zhang_BrainCelltype_Markers_braingenes.csv", header = T)

DEG_list <- readLines("~/Documents/PhD/Parkinsons/Parkinsons_Code/Results/FamilialBlood/ALS_sfblood_ALLgenes.txt")

mart <- useMart("ENSEMBL_MART_ENSEMBL",dataset="hsapiens_gene_ensembl", host="www.ensembl.org")
attributes <- listAttributes(mart)
mart_back <- getBM(attributes =c("hgnc_symbol", "uniprotswissprot"), filters="hgnc_symbol", values=DEG_list,  mart=mart)
genelist_Uniprot <- subset(mart_back, !(mart_back$uniprotswissprot == ""))
setwd("/Users/clairegreen/Documents/PhD/Parkinsons/Parkinsons_Code/Results/FamilialBlood/")
write.csv(genelist_Uniprot, "martback.csv", row.names = F)

# IDENTIFY MISSING GENES AND FIND UNIPROT CODES FOR THEM. NB SOME GENES MAY NOT BE PROTEIN CODING #

mart_table <- read.csv("martback.csv", header = T) #A table with the uniprot codes for the DEGs
uniprot_gene <- mart_table$uniprotswissprot

DEG_PPI <- subset(PPI, PPI$V1 %in% uniprot_gene | PPI$V2 %in% uniprot_gene)
rownames(DEG_PPI) <- 1:nrow(DEG_PPI)

write.csv(DEG_PPI, "DEG_PPI_famBlood.csv", row.names = F)

## Convert Uniprot ID to HGNC symbol. Biomart jumbles output so
## Go To https://biodbnet-abcc.ncifcrf.gov/db/db2db.php submit each column of names. Select Uniprot Accession for input and Gene Symbol for output.
# Select "NO" for remove duplicate input valies

DEG_PPI <- read.csv("DEG_PPI_famBlood.csv")

DEG_PPI <- subset(DEG_PPI, DEG_PPI$Gene1 !="-")
DEG_PPI <- subset(DEG_PPI, DEG_PPI$Gene2 !="-")

write.csv(DEG_PPI, "FinalPDPPI.csv", row.names = F, quote = F)

PPI Network

Common Coexpression

I ran common coexpression again using Sharc. DUM was again removed because it had ~100,000 fewer edges than the other datasets. I then plotted the distributions of rho values to check they were normally distributed:

From this it is clear that the LEW and MOR.FC datasets do not have the power for a normal coexpression distribution. These two datasets were therefore removed from the common coexpression edge analysis.

When looking for a common edge, I tried three different minimum Rho values; 0.5, 0.4, and 0.3.

Threshold of 0.5

Threshold of 0.4

Threshold of 0.3

The largest connected component consists of 82 genes and 95 edges.

Largest Connected Component of 0.3

IrefIndex 14.0 PPI between PD Malacards and LCC

Here we can see that there are 25 LCC proteins which forming 45 PPI relationships with 25 PD Malacard genes (1 overlap of PARK7)

VarElect

Although already noted, the VarElect results clearly show the membership of PARK7 in the network. The interesting thing about PARK7 being here is that it is more commonly associated with early-onset or juvenile Parkinson’s disease. Another interesting point is that, although PARK7 PD is early-onset, it’s classified by UniProt as having “slow progression and initial good response to levodopa”.

CYCS or Cytochrome C, Somatic, is a small heme protein that functions as a central component of the electron transport chain in mitochondria. It is closely related to Parkinson’s disease in the literature due to its apparent malfuction in the disease.

Direct VarElect Results

Symbol Description Category GIFTs Matched Phenotypes Matched Phenotypes Scount Global Rank (Total Genes 11967) P-Value Score Average Disease Causing Likelihood
PARK7 Parkinsonism Associated Deglycase Protein Coding 56 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 5 270.87
CYCS Cytochrome C, Somatic Protein Coding 57 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 63 63.23
UBE2E3 Ubiquitin Conjugating Enzyme E2 E3 Protein Coding 51 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 119 48.51
STXBP1 Syntaxin Binding Protein 1 Protein Coding 57 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 370 18.76
VDR Vitamin D Receptor Protein Coding 61 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 380 18.20
COX5A Cytochrome C Oxidase Subunit 5A Protein Coding 53 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 536 13.99
PGK1 Phosphoglycerate Kinase 1 Protein Coding 59 Parkinson’s Disease, late-onset, neurodegeneration 3 513 12.40
NDUFA8 NADH:Ubiquinone Oxidoreductase Subunit A8 Protein Coding 50 Parkinson’s Disease, LRRK2, late-onset 3 641 10.77
EIF4EBP1 Eukaryotic Translation Initiation Factor 4E Binding Protein 1 Protein Coding 57 Parkinson’s Disease, LRRK2, late-onset 3 715 9.90
DLD Dihydrolipoamide Dehydrogenase Protein Coding 61 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 899 9.68
PSMC6 Proteasome 26S Subunit, ATPase 6 Protein Coding 50 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 925 9.51
PLA2G2A Phospholipase A2 Group IIA Protein Coding 56 Parkinson’s Disease, late-onset, neurodegeneration 3 770 9.43
RTN4 Reticulon 4 Protein Coding 54 LRRK2, late-onset, neurodegeneration 3 792 9.21
NDUFA5 NADH:Ubiquinone Oxidoreductase Subunit A5 Protein Coding 51 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 998 8.87
UQCRC2 Ubiquinol-Cytochrome C Reductase Core Protein 2 Protein Coding 55 Parkinson’s Disease, LRRK2, neurodegeneration 3 1033 7.46
RAB11A RAB11A, Member RAS Oncogene Family Protein Coding 55 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 1321 6.93
SFXN1 Sideroflexin 1 Protein Coding 47 Parkinson’s Disease, LRRK2, neurodegeneration 3 1241 6.34
RAB1A RAB1A, Member RAS Oncogene Family Protein Coding 51 Parkinson’s Disease, LRRK2, neurodegeneration 3 1282 6.14
PSMC5 Proteasome 26S Subunit, ATPase 5 Protein Coding 51 Parkinson’s Disease, LRRK2, neurodegeneration 3 1311 6.03
PSMA7 Proteasome Subunit Alpha 7 Protein Coding 55 Parkinson’s Disease, LRRK2, neurodegeneration 3 1482 5.27
SUMO1 Small Ubiquitin-Like Modifier 1 Protein Coding 56 Parkinson’s Disease, neurodegeneration 2 1277 5.04
ACLY ATP Citrate Lyase Protein Coding 53 Parkinson’s Disease, LRRK2, neurodegeneration 3 1577 4.91
SDHA Succinate Dehydrogenase Complex Flavoprotein Subunit A Protein Coding 56 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 1850 4.77
PSMD6 Proteasome 26S Subunit, Non-ATPase 6 Protein Coding 47 Parkinson’s Disease, LRRK2, neurodegeneration 3 1696 4.56
UQCRC1 Ubiquinol-Cytochrome C Reductase Core Protein 1 Protein Coding 50 Parkinson’s Disease, LRRK2, neurodegeneration 3 1746 4.43
PDHB Pyruvate Dehydrogenase E1 Beta Subunit Protein Coding 56 Parkinson’s Disease, LRRK2, neurodegeneration 3 1781 4.32
NR4A1 Nuclear Receptor Subfamily 4 Group A Member 1 Protein Coding 55 Parkinson’s Disease, LRRK2, neurodegeneration 3 1865 4.09
IARS Isoleucyl-TRNA Synthetase Protein Coding 55 Parkinson’s Disease, neurodegeneration 2 1597 3.97
ATP6V1E1 ATPase H+ Transporting V1 Subunit E1 Protein Coding 52 Parkinson’s Disease, LRRK2, late-onset 3 2008 3.70
MAP2K1 Mitogen-Activated Protein Kinase Kinase 1 Protein Coding 65 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 2197 3.60
PPP2CA Protein Phosphatase 2 Catalytic Subunit Alpha Protein Coding 56 Parkinson’s Disease, LRRK2, neurodegeneration 3 2050 3.58
TRIM37 Tripartite Motif Containing 37 Protein Coding 51 neurodegeneration 1 1309 3.49
UQCRFS1 Ubiquinol-Cytochrome C Reductase, Rieske Iron-Sulfur Polypeptide 1 Protein Coding 51 Parkinson’s Disease, LRRK2, neurodegeneration 3 2137 3.31
VAV1 Vav Guanine Nucleotide Exchange Factor 1 Protein Coding 57 LRRK2, late-onset 2 1938 3.15
RUNX2 Runt Related Transcription Factor 2 Protein Coding 57 late-onset 1 1481 3.04
CUL3 Cullin 3 Protein Coding 55 Parkinson’s Disease, LRRK2, neurodegeneration 3 2349 2.62
PPIA Peptidylprolyl Isomerase A Protein Coding 57 Parkinson’s Disease, LRRK2, neurodegeneration 3 2366 2.55
UBE2K Ubiquitin Conjugating Enzyme E2 K Protein Coding 51 Parkinson’s Disease, LRRK2, neurodegeneration 3 2388 2.49
COPS5 COP9 Signalosome Subunit 5 Protein Coding 51 Parkinson’s Disease, LRRK2, neurodegeneration 3 2450 2.31
PDHA1 Pyruvate Dehydrogenase E1 Alpha 1 Subunit Protein Coding 59 Parkinson’s Disease, LRRK2, neurodegeneration 3 2547 2.04
DDX1 DEAD-Box Helicase 1 Protein Coding 51 Parkinson’s Disease, neurodegeneration 2 2528 1.71
ANXA7 Annexin A7 Protein Coding 51 late-onset, neurodegeneration 2 2571 1.60
YARS Tyrosyl-TRNA Synthetase Protein Coding 53 late-onset, neurodegeneration 2 2576 1.60
PSMD12 Proteasome 26S Subunit, Non-ATPase 12 Protein Coding 49 Parkinson’s Disease, LRRK2, neurodegeneration 3 2710 1.50
EPS15 Epidermal Growth Factor Receptor Pathway Substrate 15 Protein Coding 54 Parkinson’s Disease, LRRK2 2 2701 1.26
PIK3CB Phosphatidylinositol-4,5-Bisphosphate 3-Kinase Catalytic Subunit Beta Protein Coding 57 LRRK2, neurodegeneration 2 2746 1.11
MDH1 Malate Dehydrogenase 1 Protein Coding 55 Parkinson’s Disease, neurodegeneration 2 2770 1.04
ATP6V1B2 ATPase H+ Transporting V1 Subunit B2 Protein Coding 56 LRRK2, neurodegeneration 2 2775 1.03
GABARAPL2 GABA Type A Receptor Associated Protein Like 2 Protein Coding 55 Parkinson’s Disease 1 2652 0.99
IARS2 Isoleucyl-TRNA Synthetase 2, Mitochondrial Protein Coding 49 Parkinson’s Disease 1 2681 0.94
STRAP Serine/Threonine Kinase Receptor Associated Protein Protein Coding 45 LRRK2, neurodegeneration 2 2823 0.88
FOSL1 FOS Like 1, AP-1 Transcription Factor Subunit Protein Coding 52 LRRK2, neurodegeneration 2 2826 0.87
ICK Intestinal Cell Kinase Protein Coding 53 Parkinson’s Disease 1 2711 0.87
TXNL1 Thioredoxin Like 1 Protein Coding 47 late-onset 1 2741 0.79
TMEM30A Transmembrane Protein 30A Protein Coding 46 Parkinson’s Disease, LRRK2 2 2895 0.62
RRAS RAS Related Protein Coding 52 LRRK2 1 2853 0.54
PRDM4 PR/SET Domain 4 Protein Coding 43 LRRK2 1 2853 0.54
NPTN Neuroplastin Protein Coding 49 neurodegeneration 1 2878 0.49
ATP6V1H ATPase H+ Transporting V1 Subunit H Protein Coding 48 LRRK2 1 2905 0.39
NME1 NME/NM23 Nucleoside Diphosphate Kinase 1 Protein Coding 57 neurodegeneration 1 2920 0.34
REEP5 Receptor Accessory Protein 5 Protein Coding 46 LRRK2 1 2936 0.28
SUCLA2 Succinate-CoA Ligase ADP-Forming Beta Subunit Protein Coding 57 LRRK2 1 2936 0.28
BCAS2 BCAS2, Pre-MRNA Processing Factor Protein Coding 46 LRRK2 1 2938 0.27
SF3A2 Splicing Factor 3a Subunit 2 Protein Coding 46 LRRK2 1 2938 0.27
HLTF Helicase Like Transcription Factor Protein Coding 48 LRRK2 1 2938 0.27
RAB22A RAB22A, Member RAS Oncogene Family Protein Coding 44 LRRK2 1 2938 0.27
RAN RAN, Member RAS Oncogene Family Protein Coding 54 neurodegeneration 1 2958 0.10
PTBP2 Polypyrimidine Tract Binding Protein 2 Protein Coding 47 neurodegeneration 1 2959 0.07

Indirect VarElect Results

Implicated Symbol Implicating Symbol Description Category GIFTs Matched Phenotypes Matched Phenotypes Count Global Rank (Total Genes 11967) Score (Implicated)
UBE2V2 PRKN Parkin RBR E3 Ubiquitin Protein Ligase Protein Coding 47 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 3 23.14
UBE2V2 RPS27A Ribosomal Protein S27a Protein Coding 53 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 45 23.14
UBE2V2 FBXO7 F-Box Protein 7 Protein Coding 50 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 16 23.14
UBE2V2 UCHL1 Ubiquitin C-Terminal Hydrolase L1 Protein Coding 62 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 7 23.14
UBE2V2 UBE2E2 Ubiquitin Conjugating Enzyme E2 E2 Protein Coding 49 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 145 23.14
SNAP91 SNCA Synuclein Alpha Protein Coding 61 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 2 13.14
SNAP91 PRKN Parkin RBR E3 Ubiquitin Protein Ligase Protein Coding 47 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 3 13.14
SNAP91 SYNJ1 Synaptojanin 1 Protein Coding 57 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 33 13.14
SNAP91 DNAJC6 DnaJ Heat Shock Protein Family (Hsp40) Member C6 Protein Coding 51 Parkinson’s Disease, LRRK2, late-onset 3 27 13.14
SNAP91 UCHL1 Ubiquitin C-Terminal Hydrolase L1 Protein Coding 62 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 7 13.14
TERF2IP UCHL1 Ubiquitin C-Terminal Hydrolase L1 Protein Coding 62 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 7 12.55
TERF2IP SNCA Synuclein Alpha Protein Coding 61 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 2 12.55
TERF2IP AKT1 AKT Serine/Threonine Kinase 1 Protein Coding 65 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 79 12.55
TERF2IP RPS27A Ribosomal Protein S27a Protein Coding 53 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 45 12.55
TERF2IP TBP TATA-Box Binding Protein Protein Coding 57 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 40 12.55
PYCARD BAX BCL2 Associated X, Apoptosis Regulator Protein Coding 57 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 107 9.78
PYCARD NLRP3 NLR Family Pyrin Domain Containing 3 Protein Coding 56 Parkinson’s Disease, late-onset 2 222 9.78
PYCARD PRKN Parkin RBR E3 Ubiquitin Protein Ligase Protein Coding 47 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 3 9.78
PYCARD CASP3 Caspase 3 Protein Coding 61 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 56 9.78
PYCARD SQSTM1 Sequestosome 1 Protein Coding 59 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 48 9.78
SRPRB SNCA Synuclein Alpha Protein Coding 61 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 2 8.42
SRPRB PRKN Parkin RBR E3 Ubiquitin Protein Ligase Protein Coding 47 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 3 8.42
SRPRB RPS27A Ribosomal Protein S27a Protein Coding 53 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 45 8.42
SRPRB UCHL1 Ubiquitin C-Terminal Hydrolase L1 Protein Coding 62 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 7 8.42
SRPRB SNCAIP Synuclein Alpha Interacting Protein Protein Coding 52 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 8 8.42
EIF1B EIF4G1 Eukaryotic Translation Initiation Factor 4 Gamma 1 Protein Coding 55 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 36 7.94
EIF1B UCHL1 Ubiquitin C-Terminal Hydrolase L1 Protein Coding 62 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 7 7.94
EIF1B RPS27A Ribosomal Protein S27a Protein Coding 53 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 45 7.94
EIF1B SOD2 Superoxide Dismutase 2 Protein Coding 60 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 78 7.94
EIF1B GAPDH Glyceraldehyde-3-Phosphate Dehydrogenase Protein Coding 58 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 98 7.94
MRPL15 PARK7 Parkinsonism Associated Deglycase Protein Coding 56 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 5 7.78
MRPL15 RPS27A Ribosomal Protein S27a Protein Coding 53 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 45 7.78
MRPL15 NDUFS4 NADH:Ubiquinone Oxidoreductase Subunit S4 Protein Coding 52 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 31 7.78
MRPL15 TH Tyrosine Hydroxylase Protein Coding 62 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 24 7.78
MRPL15 EIF4G1 Eukaryotic Translation Initiation Factor 4 Gamma 1 Protein Coding 55 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 36 7.78
LYL1 GDNF Glial Cell Derived Neurotrophic Factor Protein Coding 59 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 52 7.74
LYL1 PLAU Plasminogen Activator, Urokinase Protein Coding 63 LRRK2, late-onset 2 162 7.74
LYL1 NFKB1 Nuclear Factor Kappa B Subunit 1 Protein Coding 63 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 409 7.74
LYL1 ZEB1 Zinc Finger E-Box Binding Homeobox 1 Protein Coding 57 late-onset 1 273 7.74
LYL1 BAX BCL2 Associated X, Apoptosis Regulator Protein Coding 57 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 107 7.74
XRCC2 RPS27A Ribosomal Protein S27a Protein Coding 53 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 45 6.43
XRCC2 NBN Nibrin Protein Coding 55 Parkinson’s Disease, late-onset, neurodegeneration 3 388 6.43
XRCC2 TP53 Tumor Protein P53 Protein Coding 65 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 201 6.43
XRCC2 COMT Catechol-O-Methyltransferase Protein Coding 61 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 34 6.43
XRCC2 ATM ATM Serine/Threonine Kinase Protein Coding 64 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 498 6.43
BMP8B PLA2G6 Phospholipase A2 Group VI Protein Coding 57 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 12 6.27
BMP8B BDNF Brain Derived Neurotrophic Factor Protein Coding 59 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 42 6.27
BMP8B MAPK14 Mitogen-Activated Protein Kinase 14 Protein Coding 62 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 89 6.27
BMP8B IL10 Interleukin 10 Protein Coding 57 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 243 6.27
BMP8B GDNF Glial Cell Derived Neurotrophic Factor Protein Coding 59 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 52 6.27
LAPTM4B GBA Glucosylceramidase Beta Protein Coding 59 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 11 5.59
LAPTM4B ABCB1 ATP Binding Cassette Subfamily B Member 1 Protein Coding 61 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 449 5.59
LAPTM4B NEDD4 Neural Precursor Cell Expressed, Developmentally Down-Regulated 4, E3 Ubiquitin Protein Ligase Protein Coding 55 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 669 5.59
LAPTM4B SLC11A2 Solute Carrier Family 11 Member 2 Protein Coding 58 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 132 5.59
LAPTM4B NFE2L2 Nuclear Factor, Erythroid 2 Like 2 Protein Coding 56 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 351 5.59
ADSS GBA Glucosylceramidase Beta Protein Coding 59 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 11 4.39
ADSS PLA2G6 Phospholipase A2 Group VI Protein Coding 57 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 12 4.39
ADSS GAD1 Glutamate Decarboxylase 1 Protein Coding 61 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 120 4.39
ADSS APOE Apolipoprotein E Protein Coding 61 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 13 4.39
ADSS MAPK14 Mitogen-Activated Protein Kinase 14 Protein Coding 62 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 89 4.39
BTBD3 CUL1 Cullin 1 Protein Coding 55 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 143 1.47
BTBD3 SKP1 S-Phase Kinase Associated Protein 1 Protein Coding 52 Parkinson’s Disease, LRRK2, neurodegeneration 3 310 1.47
BTBD3 E2F1 E2F Transcription Factor 1 Protein Coding 53 Parkinson’s Disease, LRRK2, neurodegeneration 3 1777 1.47
BTBD3 DAPK1 Death Associated Protein Kinase 1 Protein Coding 57 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 823 1.47
BTBD3 PLCB1 Phospholipase C Beta 1 Protein Coding 57 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 1019 1.47
RTCA LMNB1 Lamin B1 Protein Coding 57 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 228 1.23
RTCA NOP56 NOP56 Ribonucleoprotein Protein Coding 51 Parkinson’s Disease, late-onset, neurodegeneration 3 373 1.23
RTCA IL6 Interleukin 6 Protein Coding 59 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 197 1.23
RTCA HNRNPA1 Heterogeneous Nuclear Ribonucleoprotein A1 Protein Coding 56 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 538 1.23
RTCA TNF Tumor Necrosis Factor Protein Coding 63 Parkinson’s Disease, LRRK2, late-onset, neurodegeneration 4 185 1.23

EnrichR Enrichment

KEGG

KEGG

Wikipathways

Wikipathways

Reactome

Reactome

GO Cellular Component

GO Cellular Component

GO Biological Process

GO Biological Process

GO Molecular Function

GO Molecular Function